Have You Planned Your Summer?


Why not join your friends and fellow-workers at Syracuse University,
Syracuse, New York, July 8-19? In addition to two weeks of learning and
enjoyment, you can earn two credit hours of university work, undergraduate or
graduate.

On the campus of Syracuse University you will meet old friends and make
many new ones. You will enjoy the entertainment which is being planned for
members of this Eighth Annual Conference on Elementary Education.
Attendance at previous conferences has ranged from 200 to 675. Don’t miss
enrolling for this two weeks’ meeting. See pages 24-26 of this bulletin for
details. Mail the reservation blank on page 26 today!
PERTINENT PARAGRAPHS


NEA Research Division 





There is an old adage that it is 
difficult in fair weather to get a man 
to fix the leak in his roof. It is some- 
times equally difficult to get organ- 
ized teachers’ groups interested in 
preparing well in advance for sessions
of the state legislature. Many of the 
items treated in previous articles in 
this series have had legislative sig- 
nificance, e.g., state certification stand- 
ards for principals and state minimum 
salary laws. Some of the items in the 
present article should also be con- 
sidered from the legislative angle. 


Exchange of teachers 


Many principals are interested in 
having exchanges of teachers with 
foreign countries. This plan is being 
widely discussed today in professional 
articles and is generally recognized 
as potentially important as a means of 
building international goodwill. How- 
ever, a recent tentative exploration of 
state laws by the NEA Research 
Division indicates that certain legal 
impediments exist. For example, at 
least one state has a law forbidding 
the employment in any educational in- 
stitution of a person who is not a 
citizen of the United States. The state 
teacher oath laws in several states 
make it difficult, if not impossible, for 
a foreign visitor to accept the re- 
quired conditions of employment. 
Looking at the problem from the 
angle of the American teacher, there 
are difficulties with respect to state 
laws and local rules governing tenure, 
leaves of absence, retirement, salary 
schedules, contracts, and certification. 


Apparently only California, Hawaii, 
New York, Oregon, and Puerto Rico 
have taken steps to pass permissive 
legislation to facilitate the exchange 
of teachers. Even in these areas the 
full import and the application of the 
permissive laws are not entirely clear.


Schoolboard hearings 


From several years of analysis of 
tenure cases reaching the highest state 
courts the NEA Research Division 
concluded that many cases arose from,
or were complicated by, factors
that could be controlled or eliminated.
Many disputes, for example, arise 
from faulty contracts or from lack of 
knowledge on the part of classroom 
teachers and administrators. One 
other avoidable difficulty often is the 
careless way in which schoolboards 
have conducted their hearings. After 
analysis of state laws and court cases 
the Division has produced a bulletin 
entitled Essentials of a Proper School- 
board Hearing. Since principals are 
sometimes involved in such hearings 
the bulletin should be of interest to 
them as well as to members of school- 
boards and school superintendents. 


Veteran education 


A postcard inquiry in November to 
city-school systems over 5000 in popu- 
lation showed nearly 13,000 veterans 
enrolled in 958 cities replying. Thirty 
percent of the cities (mostly under 
30,000 in population) reported no 
veterans enrolled. Apparently most of 
the problems of veteran education 
now rest upon secondary and college 
institutions. Although the inquiry did
not go into the point, we wonder what 
is happening to those veterans who 
need instruction in elementary-school 
subjects. Is the problem of veteran 
education merely one of retraining in 
certain basic vocational skills in
anticipation of quick employment?


Discussion booklets 


Principals looking for materials for 
their faculty meetings should find 
some help in the series of pamphlets 
being developed by the NEA Research
Division for the NEA Department
of Classroom Teachers.
These bulletins are written in somewhat
popular style and contain research
information, questions for discussion,
and selected references.
Topics dealt with to date include:
teacher tenure; teacher retirement;
planning postwar education; paying
for schools; ethics for teachers; credit
unions; and leaves of absence. The
eighth bulletin, now in process,
discusses salary scheduling procedures.
The bulletins are being used in teacher 
education classes and may be useful 
in principals’ local study groups. 


State legislative procedure 


A study of the legislative procedures 
of state education associations, al- 
though prepared for the restricted use 
of state staffs, may be referred to here 
in general terms. In only 10 of the 
39 states reporting are there staff 
workers specifically assigned to state 
legislation; only 2 are full time in this
work. In most states legislative programs
are shared responsibilities involving
the staff and field committees.
A heavy load usually falls upon the
state secretary. The role of local associations
falls into three categories:


(a) building professional understanding;
(b) developing lay support;
and (c) informing legislators in their 
respective districts. Individual mem- 
bers of state associations appear to 
serve by (a) keeping themselves in- 
formed, and (b) performing specific- 
ally delegated tasks. Of the many 
public relations devices in use, “per- 
sonal work” is considered the most 
effective. 


Looking ahead 


Many groups are giving thought to 
child and youth problems and to 
future school opportunities. A booklet 
Looking Toward Tomorrow's Educa- 
tion (prepared by the NEA Research 
Division and issued by the Joint Com- 
mittee of the NEA and National Con- 
gress of Parents and Teachers) will 
be especially helpful to elementary 
school principals. From the Children’s 
Bureau of the Department of Labor 
has come State and Community Plan- 
ning for Children and Youth (Pub- 
lication 312) and Building the Future 
for Children and Youth (Publication 
310). The American Country Life 
Association (with the help of the 
NEA Department of Rural Educa- 
tion) has issued Youth in the Rural 
Community—a bulletin to orient rural 
youth with respect to education, em- 
ployment, and community life. From 
the Public Affairs Committee (30 
Rockefeller Plaza, New York 20), 
has come a discussion pamphlet, 
Youth and Your Community, and 
from the New Jersey State Education 
Association, What Should Children
and Parents Expect of Each Other? 


The Elementary School Testing Program


C. W. Martin 


Professor of Education and Director of Research, State Teachers College, 
Kirksville, Missouri 


At this day and time most of the principals know what a testing program 
should be, or they have a pretty good idea of the nature of such a program 
because they have had training in measurements and statistics, and they see
the need for better measurement in the schools. School administrators believe
in scientific measurement in school and know that this is an important phase
of school work. Yet, in the opinion of the writer, most of the principals are not 
proud of the programs of testing that they now have or that they have had in 
the past. It is not easy to tell why testing programs have been poor, inefficient, 
and inadequate. The testing program for a school is too important to be neg- 
lected or to be poorly carried out if attempted at all. The purpose of this article 
is to stimulate better measurement in the elementary school. 

What should a testing program be? This is a question that is not easy to 
answer specifically and definitely. A testing program that is thorough, usable, 
and beneficial will have various phases and will be almost a continuous process 
and a cumulative affair. It will be such as to show the strong and weak points 
of individuals and of groups of individuals. It will use a variety of devices, tests, 
and instruments for giving a clear, complete, and well-rounded picture of in- 
dividuals and of groups. It will suggest, devise, and carry out remedial pro- 
cedures to the end that learning is promoted. 

What are some of the phases of a testing program? In this brief article no 
attempt will be made to be all inclusive but merely to point out some of the 
bigger and more common phases of such a program. 

One phase of the program must be the physical examination. Looking after 
the health and physical welfare of the pupils in many school systems is done
rather creditably and systematically, but still too often and in too many schools 
this is not the case. Not all the troubles that children have in school are caused 
by mental deficiency. Too frequently that child who appears to be lacking in 
ability is accepted by the teachers as dull when in reality he is not dull at all 
but is suffering from some physical deficiency that makes him appear to lack 
mental ability. The child who habitually miscopies materials from the black- 
board or frequently mispronounces words in reading may be neither dull nor 
inattentive; his trouble may be poor vision. Physical defects and difficulties 
need to be known and in so far as possible the proper remedial measures taken. 
Physical examinations should be made by competent physicians and nurses, 
but teachers should be trained in detecting symptoms of physical difficulties; 
the child’s physical welfare cannot be forgotten after he has had a physical 
examination, but must be continuously observed and frequently checked. 

Another phase of the testing program certainly is the measurement of
mentality. The school should feel pretty certain concerning the general
intelligence of every pupil, and the school can be pretty certain about the general
intelligence of its pupils. There are now many splendid group tests of intel- 
ligence specifically adapted to age groups that are excellent in quality and easy 
to administer, score, and interpret. With most of the pupils good standardized 
group intelligence tests will provide all that is desired in determining the 
general intelligence, but there will remain special cases that should be tested 
with special tests, such as the 1937 Revision of the Binet Scale or some good 
non-language type intelligence test. Poor readers in our schools are numerous, 
and with those pupils who read poorly the verbal-type intelligence tests should 
not be relied upon exclusively because of their lack of validity for such pupils. 
In some cases even performance tests can and should be used. It seems to the 
writer that until the physical ability and the general mental ability of each 
child is known the testing program lacks the basic foundation stones on which 
it must be built. Other testing will follow, but these are the two feet on which 
the program stands. 

A third phase of a testing program should be the survey type measure- 
ment of achievement by use of the battery-type standardized tests such as the 
Stanford Achievement Tests. (This battery is mentioned only as an example 
and its use is not advocated above that of any other good battery.) Such tests 
provide a splendid measure of the achievement in a whole range of subjects 
in one booklet and at one time of testing. The results are obtained on separate
basic subjects for each pupil as well as a measure of general achievement by 
using the combined scores on all of the subjects. Such tests provide interpreta- 
tion devices such as the educational profile chart which makes the results readily 
usable and understandable by the teachers. These tests should be used no less 
than twice each school year, near the beginning and near the end of the school 
term, using a different form of the test at each time of testing. 

Battery type testing is done for the purpose of discovering the larger 
strengths and weaknesses in individuals and in groups. Such measurement 
may show that practically an entire group is low on some particular subject. 
In that case, the next step is to find out why that condition exists and to go 
about correcting it as quickly and as completely as possible. Or, again, such 
testing may show that an individual pupil ranks well in all but one or two 
subjects. The next move, then, is to find out the trouble and right it if possible 
through correct remedial practice. To use a specific illustration, a child shows 
up well on the battery except in arithmetic, or particularly arithmetic com- 
putation. Then further testing of that individual in the field of arithmetic 
computation—testing that is diagnostic in character—must be carried out to 
discover just what the difficulties are in order that the proper remedial meas- 
ures may be used. And there is no question that the remedial work is the im- 
portant part of the program. Testing is done as a means to the end that defects 
and deficiencies can and will be corrected. 

Certainly no testing program in the elementary school could be considered 
adequate without making thorough investigation of the reading ability of the 
pupils. Teachers and administrators know that a large proportion of our children
cannot read or, at best, they read with great difficulty. Much of the difficulty
and many of the failures even in high school and college are due to the
lack of ability in reading. Yet, what has been done about this serious deficiency?
How many elementary schools have a real reading program which, of course,
must include a variety of testing and the use of effective devices and instruments
now available for pointing out specific reading difficulties? It seems to
the writer that the elementary school would do well to have an ever-present 
testing program in reading which would include reading readiness tests, survey 
type reading tests, diagnostic reading tests, the measurement of eye movements,
the use of the ophthalmograph and the metronoscope, and other means
and instruments for the identification of reading difficulties. Certainly reading
is too important to be neglected in any of its phases, whether it be oral reading,
silent reading, work-study type reading, recreational reading, or whatever 
type it may be or whatever function it may perform. Reading abilities and dis- 
abilities of children should be known and continuous efforts exerted to make 
each child able to read easily and well. 

Although reading is unquestionably the most important subject in the
elementary school, this does not mean that other subjects and other basic skills 
should be neglected. A testing program in the elementary school should in- 
clude not just reading but arithmetic, science, social science, and all other 
subjects. These are sometimes classified into fields of knowledge as English, 
science, mathematics, and social science. In all of these fields testing and 
remedial work should be a regular part of the school program. 

As principals know, a measurement program means more than testing. 
Yes, testing is a part of it, a very vital and worthwhile part; but the program
should be much broader and more inclusive than just the use of tests. The 
program should include a study of home environment through home visitation, 
the use of information blanks, conferences with the pupil and parents, and any 
other means available. The emotional and social adjustment of the pupils, 
especially of those who show any abnormalities, should be studied and proper 
steps taken to aid the child in making proper adaptations and adjustments. 
Emotional troubles are quite intangible and not so easily diagnosed, but never- 
theless are the source of difficulty of many children. Some of the emotional 
difficulties can be discovered through the use of standardized scales, but much 
can be learned through conferences and through built-up case histories. The 
recreational activities of children should be given attention, and proper super- 
vision and direction should be provided. Here is a fine opportunity to teach 
children to get along with other people, to see their shortcomings in that 
respect, and to see how they are adjusted socially. Some children may be socially 
ostracized and may become shy, retiring introverts unless proper safeguards 
are set up and steps taken to avert it. All of these items and more are phases 
of a thorough testing program. 

How is the program to be administered? No doubt some one person should
direct the program, but the teachers must have a very active part in it. Probably
in most schools the principal will head up the program and direct it. He
may secure aid and advice from a specialist in the field of measurement and
guidance. This aid may go to the extent of having a specialist visit the school a
number of times to consult with the teachers concerning the program, its interpretation,
and its various phases. Or, again, if a school decides to intensify its
measurement program on some particular subject each year, then the specialist 
used should be a specialist in the particular field of study that year. However 
well the principal may know measurement, unless the teachers are taken into 
the program in administering, scoring, and interpreting the tests, and by means 
of teachers’ meetings and counseling sessions with other teachers, with the 
principal, the specialist, and the pupils, the program cannot be all that it should 
be. Such a testing program must become a vital part of the teachers’ work in 
that school, and they must see that all of this is done in an effort to help pupils;
all is done to make it possible for teachers to do a more effective job of teaching,
that boys and girls may learn to live more abundantly.





Some Aspects of Testing 


M. J. Nelson


Dean of the Faculty, Iowa State Teachers College, Cedar Falls, Iowa 


History records that the testing movement, which was then in its rather 
early stages, received a considerable impetus as a result of the use of tests in 
the first World War. While it is still too early to discern clearly the changes 
in education which will result from the educational practices evolved in the 
armed services, it seems likely that the greatest impetus given to the testing 
movement by the second World War will be in the field of guidance. In guid- 
ance, tests will, of course, play a prominent although subsidiary role. One who 
believes in the value of tests must have a feeling of gratification as he notes 
how much reliance both the Army and the Navy placed on test results. And 
if one is inclined to fret about the possibility that too much reliance was some- 
times placed upon test results to the exclusion of other available types of in- 
formation, one is likely to excuse it on the ground that time was short. A war 
was going on which must be won and some information, even though scanty 
or incomplete, concerning the men who were to be put in responsible positions 
for its prosecution was certainly better than no information at all. 

In these days following the cessation of hostilities, there is going on another 
vast experiment in guidance in which tests again will play a major role. I refer 
to the work being done through the guidance centers of the Veterans Administration.
No longer is there the tremendous urgency for haste; therefore, the
ex-service man (or woman) who presents himself (or herself) to one of these 
centers should and probably will receive enough attention both from the 
psychometrists and from the advisers so that the very best advice possible will 
be given. Educators and even the general public will undoubtedly follow this 
vast experiment in guiding and, in some instances, salvaging those to whom 
we owe so much. 

It is too early to see what influence these experiments will have on procedures
in the public schools. It is hoped that organized programs of guidance
in the elementary schools will come to be as common as are testing programs.
The emphasis at this level should not be on vocational guidance; rather, these
programs should seek to learn what are pupil strengths and weaknesses and
should then seek to develop those characteristics in which the individual pupil
exhibits potential strength and to correct discovered weaknesses. A report¹ by
Ethel Kawin on “Guidance in the Glencoe Schools” indicates the value of the 
early detection of deviations in normal development even as early as the first 
or second grade. In addition to data from one or more intelligence tests and 
from a readiness test, help in guidance is obtained from information concerning 
“the child’s home background and family relationships, his developmental 
history, his pre-school interests and experiences, the problems that he has 
presented at home, and what his parents hope he may gain from his school
experiences.” 

Procedures used in the guidance program have proved helpful in (1) reducing
the number of reading problems in the seventh grade, (2) preventing the
development of serious behavior or personality disturbances at adolescence, 
(3) limiting the problems of adjustment of children with less than average 
ability, (4) making decisions concerning the wisest procedure in making 
grade promotions or retaining pupils in a given grade, and (5) providing a 
suitably enriched program for the gifted or “superior” children. The program 
demonstrates the wisdom of using test results as but one of several kinds of 
data that, with the expenditure of some additional time and effort, may readily 
be secured. There has been a great deal of haphazard testing with paper and 
pencil tests which furnish valuable data, to be sure, but which have all too 
often not been used in an effective manner. The writer hopes that many other 
school systems will be stimulated to set up well-organized guidance programs 
in the elementary school which will make for the development of better physical 
and mental health, greater ability to cooperate, and the early establishment of 
appropriate goals in education. 

That personality adjustment of elementary school pupils can be helped by 
the typical classroom teacher without special training and without the aid of 
specialists is indicated by another study in the Appleton, Wisconsin, public 
schools.² Such studies indicate the need for personality rating scales. But it is
also evident that other instruments of evaluation besides the commonly used 
pencil and paper test must be employed and, where they are already employed, 
must in most instances be perfected. 

It is to be hoped that with the return to peacetime living renewed emphasis 
will be placed on educational experimentation and that such experimentation 
will result in (1) vast improvement in the tests which are already in use,
(2) the development of new tests in areas which up to now have largely evaded
measurement, and (3) much more intelligent and systematic use of the information
obtained by both measurement and observation. In an attempt to
dispel the notion that almost everything has been done in certain areas, I should
like to devote my remaining space to some problems in two areas of special
interest to me and which have had as much attention by measurers and experimenters
as any other, namely, intelligence and reading.

¹ See Journal of Educational Research, 37: 481-492 (March, 1944).

² Flory, Allen, and Simmons, “Classroom Teachers Improve the Personality Adjustment of Their
Pupils.” Journal of Educational Research, 37: 1-8 (September, 1944).

In the area of intelligence testing, one of the greatest needs is for more 
adequate tests of the various aspects of intelligence. Existing tests have done 
a fairly good job of giving the examiner an over-all index to the child’s mental 
ability, but when it has come to measuring aspects of intelligence such as 
potential ability in linguistics, in reasoning, or mathematics, the all-too-common 
practice has been to devote from two to four minutes to the search for such 
abilities, a period which is certainly insufficient for reasonable reliability even 
if the materials should prove to be valid. 

In addition to the aspects of intelligence mentioned above, more attention 
must be given to developing measurement of potential ability in music and 
mechanics; to evaluating social intelligence; to measuring personality characteristics;
and to experimentation designed to determine the optimal means for
developing these characteristics. To neglect evaluation is to prevent teaching 
from becoming as effective as it might be. Many will prefer to use the term 
“special aptitudes” for what I have called potential abilities, and with these I
would not quarrel. Whatever they are called, they should not be neglected. 

Of direct concern to the elementary teacher should be attention to the
inverse relationship existing between achievement quotients and intelligence 
quotients. It is a well-known fact that pupils of high ability tend to achieve 
less in school than might reasonably be expected of them; whereas pupils of 
low ability frequently do better than we have reason to expect. I am aware of 
the validity of the arguments that have been advanced against the accomplish- 
ment quotient, but the fact remains—as almost everyone will admit—that 
entirely too many pupils of excellent ability develop inferior habits of work. 
The extent to which these habits carry over to adult activities is unknown, but 
it seems logical to suppose that some of them do; and that thereby results a 
considerable loss in human productivity. 
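
The quotients discussed above can be illustrated with a little arithmetic. The following is a minimal sketch, assuming the conventional ratio definitions of the period: the intelligence quotient as mental age divided by chronological age, and the accomplishment quotient as educational age divided by mental age, each multiplied by 100. The function names and the two pupils are hypothetical and are not data from this article.

```python
def quotient(numerator_months: float, denominator_months: float) -> float:
    """Return a ratio-type quotient scaled so that 100 means 'exactly as expected'."""
    if denominator_months <= 0:
        raise ValueError("denominator age must be positive")
    return 100.0 * numerator_months / denominator_months


def intelligence_quotient(mental_age: float, chronological_age: float) -> float:
    """IQ = 100 * mental age / chronological age (assumed ratio definition)."""
    return quotient(mental_age, chronological_age)


def accomplishment_quotient(educational_age: float, mental_age: float) -> float:
    """AQ = 100 * educational age / mental age (assumed ratio definition)."""
    return quotient(educational_age, mental_age)


# Hypothetical pupils; all ages are in months and chosen only for illustration.
pupils = {
    "bright pupil": {"ca": 120, "ma": 150, "ea": 135},
    "slow pupil": {"ca": 120, "ma": 96, "ea": 102},
}

for name, p in pupils.items():
    iq = intelligence_quotient(p["ma"], p["ca"])
    aq = accomplishment_quotient(p["ea"], p["ma"])
    print(f"{name}: IQ = {iq:.0f}, AQ = {aq:.0f}")
```

In this invented example the pupil with the higher intelligence quotient has an accomplishment quotient below 100, and the pupil with the lower intelligence quotient has one above 100, which is the inverse pattern the paragraph describes.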

Turning our attention to reading, there seems to be a definite need for 
closer cooperation with professional eye-specialists. Their work has been almost 
exclusively with such correction as will allow the patient to make the neces- 
sary accommodation for clear vision. Not always do the lenses used make for 
comfortable reading over any extended period. Yet, if the pupil is going to 
enjoy reading, he must suffer no undue fatigue after fifteen minutes of con- 
tinuous reading. Unless reading is comfortable, it will not be able to compete 
on a satisfactory level with other attractions such as the radio and television, 
the motion picture, and various unprofitable ways of using leisure time. If eye- 
specialists would give more attention to testing for esophoria or exophoria, for 
various types of muscular imbalance, for disparity in the size of retinal images, 
and other defects in addition to myopia, hyperopia, and astigmatism, they might 
be very helpful to many pupils. Except in pretty completely equipped clinics, 
such tests can hardly be given by the school staff; hence, the necessity for 
interesting specialists in the problem. 
A great deal of work needs to be done in the field of reading readiness.
As a recent writer³ points out, we may know a great deal about when it is
possible to teach the child to read, but we know much less about when it is
profitable to teach him to read. If he is correct in his assumption that the best 
age is at about nine, rather than at six, we might save a great deal of work 
and, what is more important, a considerable amount of strain on the nervous 
system of many children by postponing the teaching of reading until the child 
has reached the third grade level. The question as to what children may 
profitably do before the third grade is another one with which such a program 
would need to cope. That it is not insurmountable, however, is indicated by 
the fact that nursery school and kindergarten teachers and some first grade 
teachers have found profitable activities for the children under their care 
without resorting to the teaching of reading. 

It is my hope, then, that 1946 will see the beginning of a new era of experi- 
mentation, with elementary teachers and principals taking the initiative in 
solving their many problems. If this comes to pass, the desired development 
of the testing movement is certain to come; for no program of experimentation
can be complete without evaluation.





Toward a Science of Education 


Edward A. Lincoln 
Consulting Psychologist, Halifax, Massachusetts 


A casual glance at the history of any science, in either its pure or its applied 
aspects, will reveal that advances have come about chiefly as the result of the 
invention and perfection of instruments of precision for use in investigation 
and measurement. As new and better instruments were developed, the scientist 
could better attack the unsolved problems lying before him, and was able to 
push on to greater knowledge and control. 

Astronomy, for example, was studied by the Babylonians and Egyptians 
from four to five thousand years before the Christian era. But the progress 
made in all these centuries was small and continued to be so until the inven- 
tion of the telescope. Then slowly the old incorrect ideas of the Ptolemaic 
system were replaced, and as bigger and better telescopes were made, the 
advances of the science went on apace. The use of the spectroscope to deter- 
mine the composition of the heavenly bodies and the employment of cameras to 
photograph their movements have added still more to the body of facts and 
laws in this science. 

All the biological sciences owe much of their present state of development 
to the invention and improvement of the microscope. There are very definite
limitations to the knowledge that can be gained through observations by the
naked eye, and whole new worlds have been opened as the magnifying power
of lenses and lens combinations has been increased.

³ Doll, Edgar A., “Psychological Moments in Reading.” Baltimore Bulletin of Education, 23: 46-53
(November-December, 1945).

Physics and chemistry too have employed the microscope with striking 
results. Also, in these fields there have been developed the delicate scale which 
will weigh a pencil mark on a paper, the carefully graduated burette, the 
thermometer, the voltmeter, the ammeter, and so on. We have pressure and 
temperature control devices, spectroscopes, polarimeters, X-ray machines, 
and many other similar devices and tools. Without them we would still be living 
in the age of alchemy rather than in the age of the broken atom. 

It is worthy of note also that in the early days of a science much of the 
work consists of naming, describing, defining, classifying, and like activities. 
This is perhaps most clearly seen in the early work of the botanists and
zoologists, but it took place in all sciences. Furthermore, units of measure- 
ment have to be invented as new instruments and techniques are developed. 
This our scientists have done, as is readily apparent when we think about the 
origin of such terms as grams, centimeters, ergs, horsepower, volts, amperes, 
foot-pounds, dynes, light-years, decibels, atomic weights, and many more. All 
this is necessary before the scientist can go on to discover, develop, and apply 
the facts, principles, and laws which make up the essence of the particular 
science in which he is working. 

Standardized tests are the instruments of precision through which educa- 
tion can become a science. They offer the means for accurate measurement of 
group and individual traits which could only be estimated before the tests were 
invented. Such measurement is as necessary in scientific education as the 
precise weighing of a precipitate is necessary in chemistry when the atomic 
weight of an element is to be found, or as the accurate determination of gas 
volumes when the physicist demonstrates Boyle’s Law. 

This descriptive measurement, or measurement of present status or condi- 
tion, is important in the survey use of tests when facts are desired about classes, 
grades, schools, and school systems. It is even more important when these are 
used for individual diagnosis of those pupils who are problems because of 
learning difficulties or atypical behavior. In the remedial use of tests, it is neces- 
sary to know the level of the pupil’s ability at the beginning of his special train- 
ing and to have an accurate measure of his progress as the work goes on. 

It is probably true, however, that the greatest value of the standardized 
tests in the development of a science of education lies in the fact that through 
their use it is possible to carry on scientific experimentation. This is the chief 
means by which a science grows. The two indispensable attributes of such
experimentation are, first, the control of all materials and conditions; second, the
accurate and reliable measurement of products or results. 

This will be clear if we consider the work of the chemist in his laboratory. 
He can obtain in chemically pure form the substances with which he works, 
and can weigh out exact amounts of them on delicate balances. They can be 
dissolved in chemically pure solvents at any given temperature and under any 
desired atmospheric pressure. As his investigation proceeds, he can collect all
gases, liquids, and solids and can measure them with his precision instruments
to a high degree of accuracy. Thus he has practically complete control
of all factors entering into his experiments, and with his measuring devices he 
can determine exactly what takes place. 

The standardized tests are important in educational experimentation because
they provide measuring devices which approach the precision of the
chemist’s instruments. Through their use the experimenter is able to get fairly
exact knowledge of the traits of the individuals and groups with which he is
working. He can even assemble groups of pupils with characteristics to suit
his purposes. For example, he can obtain two or more groups which are sub- 
stantially alike in general mental ability and skill in reading. 

Standardized tests also make possible the accurate measurement of changes 
brought about by the application of various experimental factors. Thus a 
change in arithmetical ability can be measured very accurately by two administrations
of a standardized arithmetic test. Likewise, the change in competency
in reading after a certain amount of drill can be measured reliably by the use 
of standardized reading tests and, furthermore, the amount and nature of drill 
can be closely controlled by the use of practice tests. Thus it is possible to set 
up experiments to study the efficacy of various methods of teaching, the 
effects of forms of school organization, the value of teaching devices, the 
efficiency of teachers, the utility of textbooks, and other similar problems. 
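
The comparison described above can be put into a small worked sketch. Everything in it is an assumption made for illustration: the scores are invented, the group names are hypothetical, and the function `mean_gain` is not taken from any published test manual. It shows only the arithmetic of measuring change by two administrations of one test and comparing two groups that were matched at the start.

```python
from statistics import mean


def mean_gain(before, after):
    """Average change in score between two administrations of the same test."""
    if len(before) != len(after):
        raise ValueError("each pupil needs a score from both administrations")
    return mean(a - b for b, a in zip(before, after))


# Hypothetical scores for two groups matched on initial ability.
drill_before = [20, 24, 22, 26]
drill_after = [28, 30, 27, 33]
control_before = [21, 23, 22, 26]
control_after = [24, 25, 25, 28]

drill_gain = mean_gain(drill_before, drill_after)
control_gain = mean_gain(control_before, control_after)

# The difference in mean gain is the quantity the experimenter attributes
# to the experimental factor, provided the other conditions were held alike.
difference = drill_gain - control_gain
```

With groups this small the difference would of course be only suggestive; the sketch illustrates the bookkeeping and not a test of significance.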

The experiments described so far deal with the application of some factor 
or factors, followed by the notation and measurement of the results. Some- 
times in educational research we are faced with a condition or result and wish 
to experiment to discover the cause. The history of medicine is full of investiga- 
tions of this sort which have been carried on to determine the causes of various 
diseases. Experimenting such as this is especially needed in education for the 
study of the causes of many disabilities and failures. Standardized tests are 
very necessary in this work because through their use the various factors in 
the situations can be measured and controlled. 

It cannot be denied, of course, that the experimenter in the field of educa- 
tion is handicapped by the fact that he deals mostly with human beings rather 
than with inert substances. This condition will always impose some limita- 
tions upon the range and techniques of his work. However, the tests and scales 
which have been developed give him a degree of knowledge and control far 
beyond what he had a generation ago. And as the makers of tests go on with 
their invention and improvement of measuring instruments, we shall approach
nearer and nearer to a scientific education, firmly grounded on experimentally 
determined fact. 





SEND to Headquarters by March 15 the names and addresses of the newly elected
president and secretary of your principals’ club or association. These names will
be included in the directory of the twenty-fifth Yearbook of the D.E.S.P.
New Tests for New Objectives in the
Elementary School 


J. Wayne Wrightstone 


Assistant Director, Bureau of Reference, Research and Statistics, Board of Education, 
New York, New York 


In recent years the modern elementary school curriculum has been re- 
constructed to include newer and more comprehensive objectives of instruction. 
The emphasis upon mastery of information has been supplemented by such 
newer objectives as pupil growth in attitudes, interests, powers of critical 
thinking, work-and-study skills and personal-social adaptability. This recon- 
struction of the curricular objectives has demanded a corresponding change 
in techniques of evaluation. 

Some persons claim a test of subject matter mastery is sufficient for
evaluating pupil growth. Common sense and objective evidence show that a 
comprehensive evaluation of pupil growth cannot be obtained from the admin- 
istration of a few tests of recall and recognition of subject matter. If the teacher 
or school officer wishes to appraise the major objectives of pupil growth, he 
must be able to describe pupil progress not only in acquisition of information 
but also in growth of related interests, of desirable attitudes, of work-and- 
study skills, of powers of critical thinking, and of adaptability in personal-social 
relationships. Since these newer objectives of teaching have been emphasized, 
a corresponding growth of instruments of appraisal has been observed. Thus, 
the teacher or school executive discovers the appearance of attitude and opinion 
scales, of interest inventories, of tests of basic study skills, of tests in critical 
thinking, and of inventories and anecdotal records which are designed to meas- 
ure the emotional and social adjustment of pupils. 

Evaluating Functional Information—School executives and teachers are 
familiar with the new-type objective tests which measure the acquisition of 
information and related skills in reading, arithmetic, spelling, history, geog- 
raphy, science, industrial arts and fine arts. Any well-known and recent book 
on “tests and measurements” will provide a wealth of suggestions about tests 
of this sort. At the elementary school level, for example, batteries of achievement
tests such as the Stanford,1 the Metropolitan,1 Modern School,2 Progressive,3
and Unit Scales of Attainment4 are available.

Evaluating Growth in Work-Study Skills—Work-study skills, so far as 
they have been defined for testing and appraisal purposes, are usually identified 
with the ability to read maps, graphs, charts, and tables, to use the table of 
contents and the index of a book, and to find items of information in reference 
books. In addition, modern elementary schools are placing an increasing 


1 World Book Company, Yonkers, N. Y.
2 Bureau of Publications, Teachers College, Columbia University, New York, N. Y.
3 Southern California School Book Depository, Los Angeles, Calif.
4 Educational Test Bureau, Minneapolis, Minn.
emphasis upon effective use of the school and local libraries. This use involves
such skills as knowing the effective use of library privileges, the techniques of 
withdrawing and returning books, the numbering or filing system of the books, 
and so on. 

At the elementary school level, the most comprehensive tests of work-study
skills now available are the Iowa Every-Pupil Tests of Basic Study
Skills.5 Tests on the use of the library at the elementary school level appear
under the title, Peabody Library Information Tests.6

Evaluating Growth in Attitudes—One outcome expected from newer
curricular and instructional practices is the development of desirable social,
scientific, and esthetic attitudes. Some attitudes which teachers encourage in
pupils are specific; others are general. Unfortunately there are few, if any,
published scales of attitudes for elementary school pupils. Most published attitude
scales are for secondary school pupils.

Beliefs and attitudes toward ideas, persons, and phenomena were meas- 
ured by a generalized attitude test, especially constructed for studies in the 
New York City schools. In this test the pupil is asked to indicate by + or - 
whether he agrees or disagrees with such statements as: 


The farmer is not as happy as the city worker .................... ( )
Most people in other countries are not as bright as Americans .... ( )
Chinese and Japanese people work as hard as white people ......... ( )


This test of civic beliefs allows expression of attitudes toward various races, 
customs, and ideas. It includes attitudes at an elementary level of aspects of 
socio-economic topics, such as transportation, communication, commerce, 
farming, food, and housing. All these topics, toward which opinions and attitudes
are expressed, are derived from the curriculum content of units of
study in elementary schools. 

Formal tests for measuring attitudes may be supplemented by evidence 
about a pupil’s attitudes as reported in anecdotal records written by the teacher 
and based upon her observation of the pupil’s actions, conversations, discus- 
sions, written statements, or reports. 

Evaluating Growth in Interests—For purposes of this discussion, interests
may be defined as those drives which lead the individual to various preferences
in effort and conduct. In New York City an interest inventory was devised
and has been used in several studies to discover pupil preferences. Sample items
are given below:


L means like; I means indifferent to or uncertain about; D means dislike.

                                             L    I    D
(a) To listen to radio news ............... ( )  ( )  ( )
(b) To read about wars and battles ........ ( )  ( )  ( )
(c) To listen to dance music .............. ( )  ( )  ( )
(d) To draw or to paint pictures .......... ( )  ( )  ( )

5 Houghton Mifflin Company, New York, N. Y. 
6 Educational Test Bureau, Minneapolis, Minn. 
Another technique of evaluating interests, used in some of the New York
City elementary schools, is a pupil log or diary in which the pupil lists the 
books and pamphlets that he has read or consulted for a unit of work. Each 
pupil lists also all reports that he writes and lists any construction work upon 
which he has engaged. 

Evaluating Growth in Critical Thinking—Every progressive teacher or 
administrator believes in the development of pupils’ powers of critical think- 
ing. This objective is prominent, especially in modern elementary science and 
social studies courses. From the work that has been done both in curriculum 
and in evaluation, several aspects of thinking may be tested by prepared scales. 
In elementary school social studies, a test is available for three aspects of 
critical thinking, namely, (1) obtaining data, (2) the interpretation of data, 
and (3) the application of principles and generalizations to new situations. The 
name of the test is Test of Critical Thinking in the Social Studies.7

Ability to draw conclusions or to make inferences from facts and materials 
read in the social and natural sciences becomes increasingly important in the 
preparation of pupils for everyday living. To be sure, the interpretation of 
data and the application of principles have been tested more or less incidentally 
and often in a haphazard manner by essay examinations. 

Evaluating Growth in Personal-Social Adaptability—Teachers and ad- 
ministrators recognize the importance of evaluating the personal-social adapt- 
ability, or personal and social adjustment of children. For appraisal of personal 
and social adjustment, a variety of methods may be used. These range from 
the free association methods, self-descriptive questionnaires and psychoneurotic 
inventories to rating scales, anecdotal records and behavior descriptions, in- 
cluding the case study methods. 

Anecdotal records have achieved some popularity as a method of record- 
ing and evaluating personal and social characteristics of children. The teacher 
makes anecdotal records of the actual behavior of selected pupils. The note or 
record is a concise description of the behavior, e.g., “John tore a page from 
Roy’s book” and not the teacher’s interpretation of this behavior, e.g., “John 
was angry and irritable.” Interpretation should come after several weeks or 
months of anecdotal records. The following example illustrates three observa- 
tions by a teacher of a ten-year-old pupil: 


September 17—Mary was in tears when she failed to solve an arithmetic 
problem correctly. 

October 6—Mary refused to take part in playground games because 
she was not chosen as leader. 

October 23—Mary called Jane a “big sissy” when Jane prepared an 
elaborate report for social studies. 


Of practical value for evaluating personal-social adaptability is the rating 
scale. Several of these have appeared at the elementary school level, including 
the Haggerty-Olson-Wickman Behavior Rating Scale8 and the Winnetka





7 Bureau of Publications, Teachers College, Columbia University, New York, N. Y. 
8 World Book Company, Yonkers, N. Y. 
Scale for Rating School Behavior and Attitudes.9 Of self-descriptive personality
scales at the elementary school level, the California Test of Personality10
is recommended. It may be used as a screening test to discover
probable cases of personality maladjustment. The results should be verified 
by teacher observation and anecdotal records, especially in cases where the 
test scores are at variance with a teacher’s opinion about a child’s personal 
and social adjustment. 

Suggestions for Evaluating the Newer Curriculum Practices—Suggestions
for evaluating the newer curriculum include various rating scales of
school practices or school procedures. A scale that may be highly recommended
for evaluating school practices, or procedures, is entitled A Scale for
Rating Elementary School Practices. It is distributed by the New York State 
Education Department. It includes a series of items for rating (1) teaching 
methods, (2) provision of materials, (3) atmosphere and environment of the 
class or school, and (4) relationships in the school. Another published scale is 
the Mort-Cornell Scale for Rating Elementary School Practices,11 which
provides an inventory and rating of the various procedures, materials and 
methods in the elementary school. In addition to these rating scales, the School 
Practices Questionnaire,12 by McCall and Loftus, may also be used. The
application of any of these rating scales will provide evidence about school 
procedures so that the principal will be able to determine how closely the 
practices in his school conform with the types of practices defined in the scales.
These rating scales provide an opportunity for critical self-evaluation by the 
principal and his teachers. 

Summary—Newer objectives in elementary schools have required the 
development of newer methods in evaluation. Steps in the process of evaluation 
are, first, to formulate a comprehensive range of curricular objectives which 
will include not only acquisition of information and facts but also evidence of
growth in interests, attitudes, work-study skills, critical thinking, and social
behavior. A second step is to find available tests and techniques or to devise
new formal and informal methods for appraising pupil growth in each objective.
These newer methods are illustrated by tests of critical thinking, tests of
personality, interest inventories, rating scales, and anecdotal records. A third
step is to apply and to interpret the evidence thus gathered about growth of
pupils. 

In order to interpret evaluation data most wisely, the fragments of evidence
collected about the pupil should be correlated and integrated into a
portrait of the individual by means of appropriate records and reports. The 
relationships among various aspects of pupil growth should be explicitly 
shown in the portrait. Only when these steps are carried through is it possible 
to realize a modern evaluation program. 


9 Winnetka Educational Press, Winnetka, Ill.
10 California Test Bureau, Los Angeles, Calif.
11 Bureau of Publications, Teachers College, Columbia University, New York, N. Y.
12 Laidlaw Brothers, New York, N. Y.
Forty Years of Educational Measurement


An Appraisal and a Prophecy
S. A. Courtis 


Professor Emeritus of Education, University of Michigan, Ann Arbor, Michigan 


My first tests were given about forty years ago, and ever since I have de- 
voted the major portion of my professional activities to tests and measure- 
ments. I have always used, and am still using, tests in my teaching and experi- 
mental work. I have constructed tests myself, and between 1909, when my 
first tests were published, and 1938, when I withdrew all my tests from the 
market because I had discovered that they did not measure what they were 
supposed to measure, I sold more than twenty million copies around the world. 
Through surveys, membership in national societies, committees, and other such 
organizations, I have observed tests in use by teachers from New York to Cali- 
fornia. Perhaps it is needless to add, but the reader should know, that I am 
very greatly biased in favor of tests, and very greatly concerned that they be 
used properly. 

Now that my formal professional life is terminated by retirement, I am 
asked to share with principals and teachers my generalizations about tests and 
testing from my experiences. This I am very glad to do because, from my 
point of view, misconceptions about tests and testing are widely held, even 
among some of those in positions of authority and leadership. As a result many 
persons—teachers, especially—are confused and distressed by their experi- 
ences with testing. 

Tests are instruments which make possible the scientific study of educational
problems; there can be no quantitative knowledge without measurement.
The methodology of science has proved its worth in many fields. In all it has
yielded basic truths and made objective prediction, control, and transfer of
power easily possible for all. Yet, in education, measurement has issued in few 
such results.1 The fault must lie in the tests themselves, in the way they are 
used, in the way the results are interpreted, or in all three. The methods of 
science are not on trial. 

One major defect in so-called educational “science” is that elements are 
assumed, not discovered. In chemistry, physics, and in all sciences in which 
investigations and experimentation have led to the discovery of law, the method 
has been to resolve complex experiences by objective analyses into component 
parts until something is eventually reached which defies further analysis. This 
something is then named. Oxygen, for instance, is the name of a form of matter 
which, within its own frame of reference, is different from every other sub- 
stance. It is unique, and its uniqueness was determined before it was named. 

Not so in education. We postulate intelligence and make a test for it before 
we know that there is such a thing as intelligence. Subsequent work has indicated
that there is no element that can be called intelligence. What intelligence

1 Carrel, Alexis, Man, the Unknown (Harper & Brothers, 1935), I: 1-2.

tests measure is merely an unanalyzed complex of behavior. Similarly
we observe an activity called reading and make a test for it. But upon investigation
it turns out that there is no such thing as an elementary reading ability.
The reading activity is an intricate, complex behavior as yet unanalyzed into
its elements. It is a very simple matter to show that no single test or scale yet
devised for educational measurement yields information about anything more 
than trivial aspects of behavior ; that such things as spelling ability, insight, or 
creativity as observed and measured in education are no more elements than 
those postulated by the ancient alchemists—fire, water, earth and air—“elements”
which were derived by the same pseudo-scientific method, subjective
postulation. 

As every thinking teacher soon discovers when she begins to use tests, the 
information derived from them is not very useful in terms of her concern for 
helping children grow as personalities, and only of minor value as a basis for 
planning a teaching program. The child who makes the lowest score in a read- 
ing test given to a particular class may really love to read, and may appreciate 
what he reads more than any other child in the class. Vice versa, the child 
whose score is highest in the same tests and class may not even have begun to 
understand what reading is all about. To accept at their face value scores in 
a reading test, or in any other test, works serious injustice to both teachers and 
children, magnifies the outmoded conception of education as the acquisition of 
academic knowledge and skill, and hinders the development of education as the 
process of releasing human talent and integrating personality. 

Educational literature today is largely a mass of superstitious beliefs, such 
as the constancy of the IQ, the existence of grade standards, the significance 
of marks, the value of teaching effort, and so on without end. Like all superstitions,
each of these is based upon some element of truth.2 For instance,
under uniform conditions the IQ is usually constant during several successive
years, but not necessarily so, and never consistently so. Anyone ought to be
able to see that the IQ, the ratio of mental age to chronological age, merely
indicates the relative rates of development of two specific types of behavior. It
is not a valid index of capacity except under conditions which are never fulfilled.3
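
For readers who want the ratio spelled out, the conventional form of the quotient can be written as below. The symbols MA and CA and the factor of 100 follow common usage and are not given in the text above; the ages in the worked line are hypothetical.

```latex
\[
\mathrm{IQ} = 100 \times \frac{\mathrm{MA}}{\mathrm{CA}},
\qquad
\text{e.g. } \mathrm{MA} = 10,\ \mathrm{CA} = 8
\;\Rightarrow\;
\mathrm{IQ} = 100 \times \frac{10}{8} = 125 .
\]
```

Written this way, the quotient stays constant from year to year only if mental age and chronological age advance in the same proportion, which is the sense in which it compares two rates of development.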

Does all this mean that tests are of no value and their total effect evil? By 
no means. A well-made test is an adequate instrument for describing precisely 
in objective terms aspects of behavior which are regarded as significant for 
certain purposes. It is only when one goes beyond the facts about behavior 
and draws unwarranted conclusions from statistical analyses based on assump- 
tions which are not valid that harm is done. 

One outstanding contribution of tests and measurement to education is, in
my judgment, the revelation of the enormous differences in score which prevail 
within all subjects and grades, in all schools alike, whether public or private, in 


2 Courtis, Stuart A., “Facts and Fancy in Educational Measurement,” Bulletin of the School of Education,
Indiana University (1942), 18: 8-24.
3 These conditions are equality of experience (including training), inspiration, testing conditions, and
all environmental influences.
this country or abroad. The range between the lowest and the highest scores
in any specific grade group of twenty-five or more children is usually from six 
to twelve times the average yearly progress in grade averages. We who began 
to teach in the last century, before there were any tests, never realized that 
children differed so greatly in their achievements. We had no conception of the 
divergent tastes, talents, and developments which prevailed in our classes, and 
which still prevail after forty years of measurement. We supposed that if any 
child could learn, all the children in the class could be made to learn if they 
and the teacher would only try. 

However, this inference goes beyond the facts. It is true, and will always 
be true throughout the world, that when any test appropriate for the children 
tested is given, individuals will differ widely in raw scores because a test 
measures the resultant of all the factors operating—age, sex, capacity, experi- 
ence, training, conditions, motives, purposes, and a thousand and one other 
factors which cannot possibly be the same for all. 

When a class made a lower score than other classes in the same grade, it 
was similarly “natural” to assume that “poor teaching” was the cause. Careful
experimentation, however, has shown that the more precisely the effects of
teaching are determined under single variable controlled conditions, the less
can any effect of teaching as such be discovered.4 The school offers such an
enormous excess of opportunities and does so little in assisting children to
make use of them that all teachers, by and large, obtain nearly the same results.5
Rate of maturation is a more influential factor than teaching.6

Tests have made another very important contribution to education. They 
have helped focus teachers’ attention on the individual child. We still have 
mass education and prescribed curricula, although everyone admits today that 
children differ in almost every conceivable way, and that it would be unwise, 
even if we could, to make them all alike. To philosophy must go the credit for 
directing attention to the importance of the personality and social elements in 
education; the testing movement has merely reinforced the emphases of 
philosophy by showing that the subject matter aims and objectives of con- 
ventional teaching are not being achieved. A revolution in education is long 
overdue. When it comes, tests will be found to have prepared the way. 

The administrative and supervisory uses of tests have done both harm and 
good. Surveys to this day are based on the conception of education as the 
“acquisition of subject matter.” Most of the men and women who control in
education have not outgrown the superstition that subject matter “results” are
the most important considerations. To the degree subject matter tests have
contributed to this deplorable situation, they have done harm. Certainly tests
have been, and are still likely to be, used to maintain in education that pattern 
of militaristic, autocratic, dictatorial control—inherited from the past—which 


4 Courtis, Stuart A., “Measurement of the Efficiency of Teaching,” Educational Administration and
Supervision (September, 1932), 18: 401-412.
5 Courtis, Stuart A., Why Children Succeed (Friesema Bros. Printing Co., Detroit, 1923), p. 194. Also chapter IX,
Conclusion C, p. 201.
6 Courtis, Stuart A., “Maturation Units for the Measurement of Growth,” School and Society (November 16,
1929), 30: 683-690.
leads to the domination of teachers by supervisors, principals, superintendents,
school boards, and parents. 

Surveys, however, have had one supreme value. They have led everyone, 
including teachers and the public, to see education as a whole—as a continuing 
process. All have come to recognize more than formerly that a child is a
maturing personality as he passes through school grades; and today many are
planning much more richly and widely in terms of personality values. 

Furthermore, measurement men have measured, in many fields, the subject 
matter of courses of study and determined its life value in terms of frequency 
of use and importance in functioning. Useless words, names, dates, and content 
have been eliminated and richer material put in their places. In Boston, in 
1845, a grade school child studied arithmetic two and a half hours a day. We 
have made some progress, but we still waste hours of time on non-essentials.
Even today it is hard to find a place in the school day for character develop- 
ment activities, cooperation, the generation of friendships, and other such 
important human relationships. 

The educator, therefore, who understands that tests measure behavior and
behavior only, and that the important elements in education are the develop- 
mental processes of maturation and integration, will use tests as a physician 
uses his thermometer. If a child is proved by tests to differ in behavior from 
other children, it merely indicates a favorable point at which to start an in- 
vestigation to determine causes. Whether the deviation is of good or bad 
import cannot be told in advance. The teacher today has a wide range of stand- 
ardized tools from which to choose in terms of her purposes, but to the degree 
she is unprepared to study her results as a scientist—impersonally, objectively, 
as a problem in research—she might better not use tests at all. No test yields 
results at all comparable in value with the generalizations of a sympathetic and 
understanding teacher in affectionate rapport with a growing child. 

What of the future? In my judgment tests have come to stay. Much as they
are misused, they are of more value on any basis than conventional subject 
matter examinations. Teachers, left to themselves, will promptly discard after 
trial any device that does not help them with their children, and just as 
promptly will continue to use any device which furthers their purposes. More- 
over, today the laws of growth are known, and a new era of truly scientific 
measurement and research is dawning. This too is one of the contributions from
the past use of tests. No teacher or administrator can afford to ignore tests, but 
tests in the hands of the ignorant or the evil-purposed are potentially as destruc- 
tive of personality values as atomic bombs. They are to be used with caution. 
However, for the research worker who is so trained in scientific methods 
that he is scientifically critical of his own work and conclusions, a new period 
of development lies ahead—an era of marvelous opportunities for making contributions
to educational progress through the use of tests.7





7 Courtis, Stuart A., “Next Steps in Educational Measurement,” Bulletin of the School of Education, 
University of Indiana (September, 1942) 18: 25-43. 





NEA Representative Assembly will meet in Buffalo, N. Y., July 2-4, 1946 
ROBERT H. EDGAR


The sudden death on December 15, 1945 of
Robert H. Edgar, past-president of the Department
of Elementary School Principals (1941-42)
has removed from our midst a man who was beloved
by his fellow-educators in all parts of the
nation. A good citizen, an outstanding elementary
school principal, and a genial, whole-souled gentleman
has gone from among us.

Mr. Edgar was a graduate of Clarion Normal
School and Geneva College and took graduate
work at the University of Pittsburgh and Columbia
University. At the time of his death he was
principal of Bedford, Humboldt, and Esplen Schools, Pittsburgh, Pennsylvania.
He was well known in the city of Pittsburgh for his active part in civic
and professional affairs.





In spite of a busy life in Pittsburgh, Mr. Edgar was enthusiastic for his
profession and he gave his services unstintingly to national education affairs,
making his influence and friendships spread from coast to coast. He served
as President, Vice-President, and as a member of the Executive Committee
of the Department of Elementary School Principals of the National Education
Association. In 1944 he became a member of the Joint Committee on Safety
Education, which prepared and published two Safety Bulletins under the
sponsorship of the Department of Elementary School Principals and the Na-
tional Commission on Safety Education. In November 1945 he met in Wash-
ington with the Committee on the Principalship, of which he was a member,
and contributed his thoughts for the forthcoming booklet, The Elementary
School Principalship—Factors in Planning.

Members of the seven Annual Conferences on Elementary Education
sponsored by this Department will remember Mr. Edgar for the splendid
service he rendered as the song leader at each of these two-week meetings.
At the Conference in Pittsburgh, as well as at the Department’s meeting during
the session of the NEA Representative Assembly in 1944, he also served as
host to members and friends of the Department. There he again demonstrated
his great capacity for friendship and leadership. He knew and liked people.

We are thankful for the inspiration of his personality and of his life, and
for the privilege of having worked with him. The memory of his good nature,
friendly spirit, and helpful attitude will live in the hearts of his friends.
Maxwell Auditorium, Syracuse University, Syracuse, New York—where Conference will be held


Two Weeks at Syracuse, New York


EIGHTH ANNUAL CONFERENCE ON ELEMENTARY 
EDUCATION 


July 8-19, 1946 


The 1946 Workshop in Elementary Education of the Department of
Elementary School Principals of the National Education Association is to be 
held at Syracuse University, Syracuse, New York. 

After a year’s delay due to war travel restrictions, our annual two-week 
conference is being planned to bring us up to date on education’s part in world 
events. It will be held July 8 to 19, 1946, immediately following the Represen- 
tative Assembly meeting of the National Education Association at Buffalo, 
New York. 

This is our first peace-time opportunity to renew acquaintanceships and 
to travel freely. It is our chance to regain perspective in our work by meeting 
together for a two-week period and finding out the changes and progress made
in the educational field. With the close of war and presence of peace we need
to re-examine our jobs.
The summer plans are closely adapted to these goals. The theme is
“Strengthening World Organization—the Function of the Elementary School.”
The knowledge that the world capital is to be in the United States carries a
grave educational concern and a splendid opportunity to learn about UNO
and UNESCO. America is beginning to accept more fully the responsibilities
that accompany great wealth and strength by taking a larger share in world
leadership. If these efforts are to be successful in the years ahead, our schools
must lead the way. Citizens must be educated from early childhood to a broader
understanding of relations between races, nations, religions, and cultures. At-
titudes appropriate to an age of atomic energy and high-speed transportation
are essential. The Workshop will provide an opportunity for a concentrated
study of some of the facts of our new era and of their implications for schools.

The selection of Syracuse is also appropriate. Central New York normally
has a pleasant summer climate. Within an afternoon drive are the famed
Finger Lakes, the Adirondacks and the Thousand Islands. Numerous beautiful
lakes, streams and hills lie within a few minutes’ drive in all directions. Recrea-
tional facilities of the University and of public parks and clubs are readily
accessible. Local committees will plan tours and other means for combining
recreation with education.

Partial View of Syracuse University Campus

All facilities of the University are to be available to the group. A dormitory 
is reserved for the use of the Workshop. Meals are planned as an integral part 
of the social program. A charge of $35 covers food and lodging for the two 
weeks. 

Maxwell Auditorium, especially designed for forum-type meetings, will
be the scene of the general sessions. A section of the Education Library will
be reserved for participants.

Credit—The student may earn two semester hours of credit for the two- 
week course. The number of the course as listed in the Syracuse University 
Bulletin is Elementary Education 145. 

Registration and Tuition—Members will register on Monday, July 8, 
between 9 and 9:30 a.m. at Maxwell Auditorium. A charge of $28 for the 
course includes $26 tuition and $2 for a copy of the Proceedings. 

The daily program will consist of a general session each morning and 
smaller seminar meetings three afternoons each week. The study activities of 
each person will center around the theme of the seminar he selects. Nationally 
recognized speakers and discussion leaders will participate. 

Reservations—To reserve accommodations in the Workshop send a $5 
check (made payable to Syracuse University) to Miss Eva Pinkston, 1201 
Sixteenth Street, N.W., Washington 6, D. C. If a person finds it impossible 
to attend the conference after he has made a reservation, his $5 will be refunded 
provided cancellation is made before June 1. 





APPLICATION BLANK 
1946 Workshop in Elementary Education 
Department of Elementary School Principals, N.E.A. 
DEPARTMENT OF ELEMENTARY SCHOOL PRINCIPALS 
1201 Sixteenth Street, N.W. 
Washington 6, D. C. 
GENTLEMEN: 


I wish to become a member of this program. Enclosed is $5.00*. Please make a reserva- 
tion for me. 










*The check should be made payable to Syracuse University. Mail application and check to Miss Eva
Pinkston.
Standard Achievement Tests
and Classroom Examinations 


Gertrude Hildreth 
415 West 118th Street, New York, New York 


There are two kinds of tests commonly used in evaluating the changes that 
take place in children as a result of their school experiences. The first, and 
more traditional, is the teacher’s examination used to measure outcome in the 
skills and content subjects. These tests have consisted largely of essay-type 
questions based on the content of the course or the textbook studied, or a 
sampling of problems to be solved. Usually these tests have been wholly local 
and temporary in character, the questions being discarded as soon as the ex- 
amination is over. The other type of test is the standardized commercially 
distributed achievement test made up largely of objective-type, fixed-answer 
questions. These tests, unlike the teacher’s classroom examinations, are widely 
distributed throughout the country, and their form remains unaltered over a 
period of years, until the need for a revised edition is apparent. Since 1920 this 
second type of test has steadily increased in popularity until today millions of 
copies are used annually in every type of school from primary grades to college. 

At first many teachers and administrators continued to show preference 
for teacher-constructed classroom tests both on account of the expense of the 
new-type tests and because they did not understand the concept of standardiza- 
tion applied to tests; nor did they see how objective-type, brief answer items 
could be as effective as the essay-type question in determining the outcomes 
of instruction. 

Within the past few years teachers and administrators have reversed their 
skeptical and negative attitude toward standardized achievement tests to such 
an extent that in some schools they have swung over to commercial tests and 
the teacher’s examination has all but disappeared. It is not unusual to hear a 
teacher reply, when questioned as to a pupil’s skill in arithmetic or knowledge 
in a content field, “I’ll be able to tell you after our standardized testing program
next month.” Toward the end of the school year the principal requests that a 
complete survey be made of reading by means of standard tests from first grade 
through the eighth, disregarding any evidence the teacher might have collected 
through occasional class tests. If the present trend continues, the “blackboard”
test may soon become extinct. 

Before discarding classroom tests entirely in favor of standardized achieve- 
ment testing, the administrator and teacher should set up an evaluation pro- 
gram in terms of the outcomes to be measured and then consider what sorts of 
tests will be most effective for the purpose. Both types of tests have certain 
advantages and limitations in the evaluation program and both have distinctive 
roles to play. 

Standardized Achievement Tests—The standardized achievement test has 
a number of distinctive features that can be illustrated in testing of reading, 
spelling, or arithmetic. In the first place, the test has wide range which is
achieved by including items distributed over a wide grade range, e.g., arithmetic
problems ranging from third to eighth grade, spelling words varying in dif-
ficulty from second to sixth grade, and reading items suitable for pupils in
grades four to eight. Such a range necessitates a much larger number of items
than the teacher’s classroom examination called for. This wide range is not
only a distinctive feature but one of great value for it indicates the true range
of individual differences in a class or group, not solely the differences the
pupils in the class show on a test covering a month’s study of fifth grade
arithmetic, which would be the function of the teacher’s classroom examination.

With such a wide range of items, objective-type questions and easily scored 
answers virtually become a necessity. The most popular form of short answer 
items is the so-called “multiple-choice” in which the pupil marks the one of 
several answers given that he believes is correct. Other types are: completion, 
true-false, and matching, descriptions of which can be obtained in any text-
book on educational measurement. These items are selected and prepared with
the greatest care by staffs of experts so that the greatest economy in giving and
scoring the test can be achieved. Tests published by reliable firms have estab-
lished validity and reliability which contribute to the accuracy of results.
Usually several forms are available so that changes in the pupil’s achievement
over a period of time can be reliably determined.

Standardization gives to published tests an advantage which is lacking in 
the teacher’s informal classroom test that is not used in the exact form a second 
time. Through standardization, tests become analogous to universal currency, 
for the results obtained with children in our schools everywhere have a com- 
mon meaning. Standardization is achieved through applying the experimental 
forms of the test to children in the grades to which the test applies in represen- 
tative schools throughout the country. The scores on the tests are expressed 
as grade or age averages and these scores become recognized as the norms for 
the test. State-wide, city-wide, or school-wide norms can be obtained in the
same way and for many local purposes are even more useful than comparisons
of a given class or pupil with country-wide norms. The
use of norms is always a debatable issue in achievement testing because these
norms represent not desirable standards the teacher works to attain, but aver-
age accomplishment (the fifty per cent rating) for all the children to whom
the test was originally applied. Slow learners will necessarily fall below the
country-wide norms through no fault of the child or his teacher. A gifted child,
on the other hand, may go “over the top” of the test with little effort on his
part. The norms serve merely as useful reference points for determining the
relative status of children in achievement compared with typical results through
the grades. Because of the wide-range features in these tests, a fifth-grader
may be discovered to fall below the typical achievement in reading of third-
graders, a fact that the teacher’s classroom test in fifth grade reading might fail
to discover. Standard tests are particularly effective for survey purposes when
the superintendent of a large school system wants to know the comparative
standing of schools and classes in the basic subjects.
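
The derivation and use of norms described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are invented and belong to no published test, and `grade_norms` and `grade_equivalent` are hypothetical helper names. Each grade's norm is taken as the median (the fifty per cent point) of the standardization sample, and a pupil's score is then located among those norms.

```python
from statistics import median

# Hypothetical standardization sample: raw reading scores by grade (invented data).
SAMPLE = {
    3: [18, 22, 25, 27, 31],
    4: [26, 30, 34, 37, 41],
    5: [35, 39, 44, 48, 52],
    6: [45, 50, 54, 57, 63],
}

def grade_norms(sample):
    """Return the median raw score for each grade; this median serves as the norm."""
    return {grade: median(scores) for grade, scores in sample.items()}

def grade_equivalent(score, norms):
    """Return the highest grade whose norm the score reaches, or None if below all norms."""
    reached = [grade for grade, norm in sorted(norms.items()) if score >= norm]
    return reached[-1] if reached else None

norms = grade_norms(SAMPLE)
pupil_score = 27  # raw score of a hypothetical fifth-grader
print(norms)
print(grade_equivalent(pupil_score, norms))
```

In this invented case the fifth-grader's score reaches only the third-grade norm, the kind of finding that a test confined to fifth-grade material could not reveal.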
Standardized tests actually prove to be time-saving compared with class-
room tests composed largely of essay-type questions, for it is conceivable that a
teacher could spend an hour in scoring one pupil’s essay-type test in the upper 
grades, whereas the objective test could be scored in five minutes. The teacher 
is also saved the time that would be required to make out the test—a task that 
can be very time-consuming in the content fields of English and the social 
studies. 

Another advantage of the objective-type test is that the results can be 
interpreted in terms of standard scores, percentile scores, or quotients which 
have comparable meaning from test to test and from grade to grade. Thus, a 
pupil’s standard scores could be directly compared in half a dozen different 
tests such as arithmetic, reading, language, spelling, vocabulary, history, and 
geography, and from these scores a profile could be constructed which would 
show graphically the individual student’s strengths and weaknesses in all these 
areas of skill and knowledge. Furthermore, profiles for the same pupil could 
be constructed in successive years to show gains through the grades. Few 
teachers have the time or knowledge which would enable them to convert the 
results of classroom tests into comparable scores of the kind usually furnished 
with standardized tests. 
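
As a rough sketch of how raw scores become comparable across subjects, the following Python fragment converts one pupil's raw scores into standard scores (distance from the group mean in standard-deviation units) and prints them as a simple profile. The class results and the pupil's scores are invented for illustration, and `standard_score` is a hypothetical helper, not a procedure taken from any test manual.

```python
from statistics import mean, pstdev

def standard_score(raw, group_scores):
    """Express a raw score in standard-deviation units from the group mean."""
    return (raw - mean(group_scores)) / pstdev(group_scores)

# Hypothetical class results per subject and one pupil's raw scores (invented data).
CLASS_SCORES = {
    "arithmetic": [40, 50, 60],
    "reading": [20, 30, 40],
    "spelling": [10, 15, 20],
}
PUPIL_RAW = {"arithmetic": 60, "reading": 30, "spelling": 10}

# The profile: one standard score per subject, all on the same scale.
profile = {
    subject: round(standard_score(PUPIL_RAW[subject], scores), 2)
    for subject, scores in CLASS_SCORES.items()
}
for subject, z in profile.items():
    print(f"{subject:<12}{z:+.2f}")
```

Because every subject is expressed on the same scale, the pupil's relative strength in arithmetic and weakness in spelling can be read directly from the profile, although the raw scores themselves are not comparable.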

Advantages of Classroom Tests—The classroom tests constructed by the 
teacher have, in turn, certain advantages lacking in standard tests. No standard 
test is as good as a well-constructed classroom test for determining specifically 
what pupils have gained from a unit of study over a month or a school term. 
The questions for this sort of examining must be custom-tailored to fit the
study unit, and there is no standard test that meets this criterion, especially in 
these days when study units in the curriculum tend to extend beyond subject 
matter boundaries. Only the teacher who has taught the unit knows what to 
cover in the final examination. Brief tests can be given at short intervals. 

The applicability of standardized achievement testing in the content field— 
social studies, science, nature study, literature and the like—always has been 
a moot point, especially when the test is needed to test not general knowledge 
in these fields over a long period of time but specific knowledge gained as a 
result of planned school studies. Any short cut to this objective through gen- 
eral standardized achievement tests may be poor economy in the end. For 
placement in high school or college courses, for evaluation of selectees for the 
armed forces with irregular school background, the wide-range subject matter 
test in brief-answer form is indispensable. Programs for these individuals 
could not be intelligently planned without the data these tests provide, but in 
the elementary grades and for many purposes in high school the classroom 
examination should not be given up for “cover-all” objective achievement tests. 
The casual, informal, frequently given classroom test always will have a place 
in the school’s total program of evaluation. 

In planning a comprehensive program, it would be well for schools to move 
in two directions: first, to use the two types of tests in complementary fashion, 
not leaning to one extreme or the other, but recognizing the respective merits 
of both types and utilizing both for the distinctive purposes they serve best. 
Second, the teacher’s classroom tests could be greatly improved by applying
to their construction some of the principles that govern the construction of
standard tests. More use can be made of objective-type, short-answer items;
essay-type questions can be improved both in their construction and in scoring.
The short-answer item in which the pupil is instructed to enumerate a number 
of points or to keep his written statements very brief should result in more 
reliable results and economy in scoring time. Instead of discarding all such 
examinations as soon as they are given, the teacher should file good test items 
on individual cards and preserve them for constructing new tests in the future. 
Mimeographing or hektographing the test sheets saves eye-strain in looking 
at questions written on the board and helps control the amount of space in 
which the pupil is to place his answers. Ingenious teachers will think of novel 
types of items that are interesting to pupils. Committees of teachers working 
together may construct their own school-wide or city-wide examinations to 
supplement and to check the results from standardized tests. 





The Importance of Acquiring Reading Skills 


William C. Krathwohl 


Institute for Psychological Services, Illinois Institute of Technology, Chicago, Illinois


The Army and Navy tests for specialized abilities during the last five years 
revealed, among other things, that the age in which we are living requires of 
us a knowledge of arithmetic, and that there was something wrong with the 
way it had been taught in many schools. They could have shown, just as clearly, 
that this age also demands of us the ability to read with understanding and 
speed, and that in this subject too there is something wrong with the instruc- 
tion our children receive. 

There is no question about the usefulness in daily life of reading skills. Not 
only do adults spend a great deal of time reading, but children acquire the 
inheritance of civilization mostly by this means. To cut in half the time re- 
quired to read books, magazines, or newspapers is to release valuable hours 
either for more reading or for other activities. On the other hand, to read 
books, magazines, or newspapers and not to know what has been read is to 
use up time which might better be put to other purposes. 

The question one might ask is, ““What evidence is there that individuals 
leave the grade schools with inadequate reading abilities?” One way to answer 
this question is to study the group who go to college. This group should have 
good reading ability. 

At the Illinois Institute of Technology all students who enter, whether they 
are freshmen or students with advanced standing, must take a battery of eight 
tests, lasting for three half days. Among these tests is the Cooperative Reading 
Comprehension Test, Higher Level. The range of scores on this test as meas- 
ured by national norms is quite surprising. For the last five years, this range 
has run from scores on national norms as low as 1.5 standard deviations below
the mean to scores as high as 4 standard deviations above the mean. 

A study was made of the relation between reading comprehension and 
success in those studies where an appreciable amount of reading must be done. 
The entire group who took the orientation examinations was divided into 
quarters, on the basis of scores on the Reading Comprehension Test; and the
grades received by students in the lowest quartile of the test were compared
with those in the highest quartile. The results are given in Tables I and II.


TABLE I
Unsatisfactory Grades

                                              Lowest       Highest
                                              Quartile     Quartile
Course                  Year    Frequency     (per cent)   (per cent)
Chemistry Lectures I    42-43      426           66           17
Physics Lectures I      43-44      414           72           26
Physics Lectures II     43-44      278           55           30
History I               43-44      492           39           12
History II              43-44      325           36           15
Economics I             42-43      456           54           16
English I               42-43      425           44           14
English I               43-44      328           30            8
English II              43-44      317           60           14

TABLE II
Honor Grades

                                              Lowest       Highest
                                              Quartile     Quartile
Course                  Year    Frequency     (per cent)   (per cent)
Chemistry Lectures I    42-43      426           13           54
Physics Lectures I      43-44      414            2           42
Physics Lectures II     43-44      278            7           39
History I               43-44      492           10           38
History II              43-44      325           13           44
Economics I             42-43      456           14           44
English I               42-43      425           17           59
English I               43-44      328           20           62
English II              43-44      317            7           50





The tables read as follows: 


In line one of Table I, under the heading “Course,” is Chemistry Lectures I. 
This represents the beginning chemistry course given during the first term 
and repeated in the second and third terms. Grades given for this course are 
independent of any proficiency in laboratory work and depend solely on knowl- 
edge acquired through lectures and reading. 
In the second column, the numbers 42-43 mean that the classes studied met
during the school year beginning July 1942 and ending June 1943. Under the 
accelerated program which was then in effect, this comprised three terms. 

In the third column, the number 426 gives the number of students in this 
study from among whom the lowest quartiles and highest quartiles were 
selected. The size of the numbers in this column precludes doubt of the results 
of this study on grounds of the smallness of the samples selected. 

In the fourth column, under “Lowest Quartile,” the number 66 means that 
of all the students in the lowest quarter of the Reading Comprehension Test, 
66 per cent received unsatisfactory grades in Chemistry Lectures I. Unsatis-
factory grades are defined as those which are either failures or poor passes.
Poor passes are not accepted for graduation unless the student either repeats 
the course with a satisfactory grade or obtains an honor grade in some other 
course with the same number of hours. 

In the fifth column, under “Highest Quartile,” the number 17 means that 
17 per cent of the students in the highest quarter of the Reading Comprehen- 
sion Test received unsatisfactory grades. 

A study of Table I shows the surprisingly large percentage of poor readers 
who do unsatisfactory work in these college courses as compared with the low
percentage of those who receive honor grades. 

It may be argued that poor readers are frequently persons of low mental 
ability ; and while this statement is true, nevertheless this group was composed 
of students who were admitted to an engineering college where mental ability 
is known to be a prime requisite. Certainly students should be expected to be 
able to read to the level of their mental ability. Furthermore, the material which 
comprised the reading test was, for the most part, totally unrelated to that in 
such courses as physics and chemistry. Had there been an appreciable amount 
of such technical material, the results might have been interpreted as due to 
specific knowledge acquired by the individuals. 

Inspection of the last two columns shows, in every case, that many more 
poor readers did unsatisfactory work than good readers. The ratio of per-
centage in the lowest quartile to that in the highest quartile for beginning
courses is sometimes as high as 4 to 1. 

Table II is a study of the students who made honor grades. These are 
defined to be the two grades A and B above the grade of C, which is a satis- 
factory passing grade. Here the results are completely reversed with, in every 
case, ratios of percentages in the upper quartile to those in the lower quartile 
well over 3 to 1. The student with good reading skills not only does better 
work but probably has more time to do more and better work. These ratios 
would undoubtedly have been even higher if it were not for remedial reading 
instruction which is given to every freshman who scores in the lower half on 
the Reading Comprehension Test. 
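
The ratios discussed in the last two paragraphs can be recomputed directly from the percentages in Tables I and II. The Python sketch below does so; the figures are transcribed from the tables (the 17 per cent for Chemistry Lectures I is the value given in the running text), while the helper name `quartile_ratio` and the rounding to one decimal place are choices made here for illustration only.

```python
# Percentages transcribed from Tables I and II:
# (course, year, lowest-quartile per cent, highest-quartile per cent).
UNSATISFACTORY = [
    ("Chemistry Lectures I", "42-43", 66, 17),  # 17 is the value quoted in the text
    ("Physics Lectures I", "43-44", 72, 26),
    ("Physics Lectures II", "43-44", 55, 30),
    ("History I", "43-44", 39, 12),
    ("History II", "43-44", 36, 15),
    ("Economics I", "42-43", 54, 16),
    ("English I", "42-43", 44, 14),
    ("English I", "43-44", 30, 8),
    ("English II", "43-44", 60, 14),
]
HONOR = [
    ("Chemistry Lectures I", "42-43", 13, 54),
    ("Physics Lectures I", "43-44", 2, 42),
    ("Physics Lectures II", "43-44", 7, 39),
    ("History I", "43-44", 10, 38),
    ("History II", "43-44", 13, 44),
    ("Economics I", "42-43", 14, 44),
    ("English I", "42-43", 17, 59),
    ("English I", "43-44", 20, 62),
    ("English II", "43-44", 7, 50),
]

def quartile_ratio(numerator, denominator):
    """Ratio of two percentages, rounded to one decimal place."""
    return round(numerator / denominator, 1)

# Table I: poor readers (lowest quartile) relative to good readers (highest quartile).
unsat_ratios = [quartile_ratio(low, high) for _, _, low, high in UNSATISFACTORY]
# Table II: good readers relative to poor readers.
honor_ratios = [quartile_ratio(high, low) for _, _, low, high in HONOR]

for (course, year, _, _), u, h in zip(UNSATISFACTORY, unsat_ratios, honor_ratios):
    print(f"{course:<22}{year:<7}{u:>6}{h:>6}")
```

Each printed row gives a course, the year, the Table I ratio, and the Table II ratio, so the two comparisons can be checked side by side.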

Freshmen in colleges, and particularly in engineering schools, find that the 
pace set in colleges is so much more strenuous than that in the lower grades 
that they have difficulty in finding enough time for their home work. This is 
particularly true when they have to cover long reading assignments in the 
humanities, while doing intensive preparation in the natural and physical
sciences. A four-hour reading assignment is accomplished in two hours, if the 
speed of reading can be doubled. Much of the difficulty confronting the students 
is due to poor reading habits; and much of this difficulty could have been 
eliminated in the elementary grades, when their minds were more receptive to 
acquiring skills and techniques. Unfortunately, little children do not think they 
are reading unless they are doing oral reading. This is readily understandable, 
and there is no question but that in the lower grades considerable oral reading 
should be done. However, some of the slow reading of college students can 
certainly be traced to too much oral reading in the lower grades. It is much 
easier to instill good reading habits in children than it is for the instructor
in high school and college to have the additional task of correcting poor read- 
ing habits acquired in the elementary school. Experience with our freshmen 
has shown that to try to correct reading deficiencies of college students who 
have had bad reading habits for years is an extraordinarily difficult task. 

Testing a child’s progress in reading should be considered just as important 
a part of the child’s education as testing his progress in arithmetic or spelling. 
The emphasis should be on silent reading comprehension and speed instead of 
on oral reading progress. Every grade school teacher should have special train- 
ing in the teaching of reading. If she is at all competent, she can make up her 
own daily or weekly tests to be followed periodically by reliable standard com- 
mercial tests as a check. By these means the poor readers can readily be recog- 
nized and proper remedial training can be given early, before they establish 
poor and defective reading patterns. 

Since the proportion of students who go to college has, under normal con- 
ditions, been increasing every year, it becomes more and more important that 
good reading habits be acquired at that time in young people’s lives when their 
minds are receptive to acquiring skills: at the grade school level.





Buffalo Meeting 
July 1-4, 1946 


The Twenty-Fifth Meeting of the Representative Assembly of the National 
Education Association will be held at Buffalo, New York, July 2, 3, and 4, 
1946, with Dr. F. L. Schlagle, President of the National Education Associa- 
tion, and Superintendent of Schools, Kansas City, Kansas, presiding. 

President Schlagle has invited the Departments of the National Education 
Association to hold their meetings on Monday, July 1—the day preceding the 
opening of the Representative Assembly. Lester J. Nielson, President of the 
Department of Elementary School Principals, and Principal, Woodrow Wilson 
School, Salt Lake City, Utah, is making plans for the Department to hold a 
breakfast, an afternoon session, and a dinner on that day. Complete announce- 
ment of the program will appear in the April issue of THE NATIONAL 
ELEMENTARY PRINCIPAL. 
Fallibility of the IQ


Anna M. Shotwell 
Senior Clinical Psychologist, Pacific Colony, Spadra, California 


Jimmy was entered in school at eight years of age but was excluded because 
he “could not learn academically.” On the basis of a Stanford-Binet intelligence 
test given at the age of 9-7, his IQ was reported to be 26. No other test result is
recorded for him until three years later, when at the age of 12 he was given a 
performance test, on which his IQ was 78. An IQ on a second Binet test given about 
this time was within three points (23) of the first-reported IQ. However, on a 
test of mechanical skill and understanding his percentile rating was such as to 
indicate superior mechanical ability. 

Present tests of Jimmy at 22 years still show him to be an “imbecile” if judged 
by a Stanford-Binet test; but performance tests of intelligence and tests of mechan-
ical ability agree in showing a high degree of skill and superiority when dealing
with concrete materials, form perceptions, and spatial relations. He can attend well 
and grasp the meaning of much of what is said to him, but he has almost no speech 
and makes himself understood largely by gesticulations. Today he is in an institu-
tion for the feeble-minded, partly on the basis of his Binet record but also because
his home is unfit and he had inadequate supervision. His father is dead and his 
mother is reported to be cruel, careless, and immoral, being described as one “who 
frequents beer parlors and entertains men in the presence of her children.” 

In the institution Jimmy spends his leisure time constructing large-sized model 
airplanes which are made of numerous pieces and are of intricate design. They are so 
well made that they are sought by various civic-minded persons and organizations 
and are displayed in prominent places. During working hours Jimmy makes laundry 
bags by machine sewing and makes and repairs mattresses. The man under whom 
he works says that he works as rapidly and as well as regular employees. Jimmy 
previously worked in the institution shoe shop, where he showed the same excep- 
tional manual ability. 


The case of Jimmy, who is aphasic, is extreme but is illustrative of those 
children in whom there is a marked discrepancy between verbal and non-verbal 
abilities. While it happens that in many—in fact, in most—individuals verbal 
and non-verbal abilities are developed to about the same level, since “In both
performance and verbal abilities we have something in common” (2), it also
happens that in other individuals verbal and non-verbal abilities are very
unequally developed. Intelligence is neither exclusively verbal nor exclusively
non-verbal; hence, tests of verbal ability alone or of performance ability alone
give only an incomplete picture of a child’s general learning ability. Where
discrepancies occur between a largely-verbal test, like the Stanford-Binet, and
a strictly non-verbal test, like the Arthur Performance Scale, the child’s true
ability can not be described by the IQ obtained on either test, although both
IQ’s may be helpful in understanding the child’s mental makeup. Indeed, no
one term nor any one IQ can adequately describe mental ability in cases of
this kind.1


1 Fortunately, we have in the Wechsler-Bellevue Scale a mental test made up of an equal number of 
performance and of verbal tests, being constructed on the hypothesis that “an individual manifests his 
intelligence by his ability to do things as well as by the way he can talk about them.” Unfortunately, 
this test is best adapted for ages above sixteen although there are norms for children as young as ten
years.
Intelligence quotients on highly verbalized tests are not only inadequate
in cases of aphasia and in those in whom verbal and non-verbal abilities are 
unevenly developed, but they may yield very unreliable ratings in cases of 
speech or hearing defect and of foreign language handicap. The acquisition of 
a second language is especially likely to result in confusion of thought and ex- 
pression which is reflected in a lower Binet IQ than is representative of the 
individual’s mental ability. The writer recently completed a study which com- 
pared American-white youths with English-speaking, American-born Mexicans 
of comparable age and Binet IQ as to their rating on the Arthur Performance 
Scale. Whereas the 80 American subjects earned a performance IQ that aver- 
aged 5 points higher than their Binet IQ, the 80 Mexicans averaged 22 points 
higher. That the difference between the means of the two groups was sig- 
nificant was indicated by the size of the Critical Ratio, which was 5. 


The intelligence quotient can never be taken entirely at its face value. As 
a measuring instrument it is far less accurate than the yardstick, for example, 
and that which it measures is far less tangible than that which the yardstick 
measures, e.g., height. The Probable Error (PE) of an IQ on the revised 
Stanford-Binet, for all age levels combined, varies from approximately 1.5 
points for low IQ’s to approximately 3.5 points for high IQ’s. Thus, when a 
Stanford-Binet IQ of 141 is reported for a child, the most that can be said of 
the child is that his IQ on the Stanford-Binet lies between 137.5 and 144.5. 
The need for bearing in mind the error of an IQ rating is apparent when com- 
mitment to an institution for the feeble-minded depends largely upon the out- 
come of a test. Some months ago a newly-admitted patient said to the writer, 
“My IQ is 69; the judge said if I’d made one point higher I wouldn’t have 
had to come here”! The IQ was never intended to be interpreted in so literal 
a way. 
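
The arithmetic behind the 137.5 to 144.5 band can be sketched as follows. This is only an illustration: the function names are hypothetical, and the conversion constant 0.6745 is the standard relation between a probable error and a standard deviation under a normal curve, not a figure taken from the article. By definition, a band of one probable error on either side of a score is expected to contain the true value in about half of all cases, so the band is better read as an even-odds statement than as a guarantee.

```python
# Illustrative sketch; names are hypothetical, not from the article.
PE_PER_SIGMA = 0.6745  # one probable error = 0.6745 standard deviations (normal curve)


def pe_band(iq: float, probable_error: float) -> tuple:
    """Return (low, high): one probable error on either side of the reported IQ."""
    return (iq - probable_error, iq + probable_error)


def pe_to_sigma(probable_error: float) -> float:
    """Convert a probable error to the equivalent standard deviation."""
    return probable_error / PE_PER_SIGMA


low, high = pe_band(141, 3.5)
print(low, high)  # 137.5 144.5
```

Applied to the patient quoted above, a reported IQ of 69 with a probable error of about 1.5 points gives a band of 67.5 to 70.5, which straddles the cutting point of 70.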

An often-overlooked fact which can lead to grave error in particular in- 
stances is that identical IQ’s on different tests may represent different orders 
of intelligence. Table 1 is presented to show the interpretation which three test 
authors put upon IQ’s obtained from their tests. Although only the IQ’s in the 
levels of average and below are included in the table, similar variation obtains 
in the above-average levels. 


TABLE I 
Intelligence Classification According to IQ (for Levels of Average and Below) 

Classification                              Terman      Kuhlmann    Wechsler 

Definite feeble-mindedness, 
  Mentally defective, Defective             Below 70    Below 75    65 and below 
Borderline, Borderline deficiency           70-80       75-84       66-79 
Dull, Dull normal                           80-90       85-94       80-90 
Average                                     90-110      95-104      91-110 

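
The way one IQ falls into different classes under the three authors can be checked with a small lookup. This is a sketch under stated assumptions: the labels are shortened, whole-number IQ’s are assumed, and where the printed ranges share an endpoint (Terman’s 70-80 and 80-90) the shared score is assigned here to the higher class, a choice the table itself does not settle.

```python
# Upper bounds (inclusive) transcribed from Table I; labels are shortened.
# Assumption: a score printed in two adjacent ranges goes to the higher class.
TABLE_I = {
    "Terman":   [(69, "defective"), (79, "borderline"), (89, "dull"), (110, "average")],
    "Kuhlmann": [(74, "defective"), (84, "borderline"), (94, "dull"), (104, "average")],
    "Wechsler": [(65, "defective"), (79, "borderline"), (90, "dull"), (110, "average")],
}


def classify(author: str, iq: int):
    """Return the class for iq under one author's scheme, or None above the table."""
    for upper_bound, label in TABLE_I[author]:
        if iq <= upper_bound:
            return label
    return None  # the table covers only the levels of average and below


for author in TABLE_I:
    print(author, classify(author, 69))
```

An IQ of 69 comes out as defective under the Terman and Kuhlmann columns but as borderline under the Wechsler column.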
One of the reasons why IQ’s from different tests are not always comparable 
is that the tests have been standardized on different populations. One author, 
when establishing norms, may have excluded all cases below or above a certain 
age; another may have excluded all foreign-born children; another may have 
excluded all markedly subnormal or unstable cases; and still another may have 
excluded none or only a percentage of special or atypical cases. Great injustice 
may be done to American Indians and Negroes as well as to foreign-born 
persons when they are assigned an IQ on the basis of an intelligence test 
standardized only on American-whites. 

Not the least limitation of the IQ is that it can describe only one aspect or 
phase of a child’s development—chiefly the ability to learn what is taught in 
the classroom, which, important as it is, is by no means the most important 
phase of living. The child is not only an intellectual being; he is also a physical, 
a social, and an emotional being. His happiness and his effectiveness as a 
citizen depend much more on his physical state of being and on his social and 
emotional adjustment than on his ability to absorb, retain, manipulate, and 
expel facts. He may be admired because he is a walking dictionary, but he will 
not necessarily be liked because of that fact. The IQ is helpful in determining 
the intellectual status—the cold, colorless, factual side of life, if you will—but 
it is woefully wanting as a means of determining social adequacy and emotional 
development, the latter representing the warm, colorful, feeling side of life and 
being of prime importance in daily living. Character traits like docility, aggres- 
siveness, and inferiority, which contribute greatly to a child’s personal and 
social adjustment, are not measured by the IQ although they may be observable 
to the examiner who administers a test. Of two children of the same age and 
same Binet IQ, one may be well-adjusted and the other extremely maladjusted, 
but something other than the IQ is needed to determine the facts about their 
adjustment. 

Indeed, personality traits can and do operate to affect the IQ rating. The 
child who at all times is cautious, deliberative, cooperative, and desirous of 
putting forth his best effort will likely earn an IQ that is higher than that of 
two children of similar mental capabilities, one of whom shows traits of impul- 
siveness, impatience, and inattention and the other of whom is extremely shy 
and withdrawn and so fearful of making a mistake that he fails to do all he is 
capable of doing on a test. 

If we bear in mind the limitations of the IQ and do not make of it a master, 
it can become our useful slave. If a six-year-old child, upon admission to the 
first grade, attains an IQ of 150 on the first administration of a Binet-type 
test and of 165 on a performance test, one can feel sure that such a child has 
very exceptional learning ability and a program can be mapped out which will 
call into play his intellectual capabilities while not neglecting his physical, 
social, and emotional well-being. In cases like this, teachers will be able to 
recognize without the aid of a test score that they are in the presence of an 
exceptionally bright child; an IQ rating, however, will not only enable them 
to become more quickly aware of the mental caliber of a child but will give 
them a more definite idea of his degree of brightness. A nursery school teacher 
once said to the psychologist who examined her group of children, “You tell 
me the first week of school what I can find out for myself after a year.” By 
knowing early in the year the approximate scholastic capabilities of her pupils, 
this teacher was assisted in setting learning standards that, from the first days 
of school, were neither too high nor too low to care for the individuals under 
her tutelage. 

High IQ’s obtained on the first administration of a carefully constructed 
and well-standardized test are likely to be reliable indicators of outstanding 
mental ability for the simple reason that if the testees had not been able to score 
high they would not have done so on a first presentation of new material. Low 
IQ’s, while they may be too low for reasons already stated, may be confirmed or 
disproved by subsequent administrations of the same and different tests and 
if found to be reliable, i.e., relatively unchanged, they enable teachers to plan 
programs which will make sufficient but not undue demands upon the learning 
capacities represented. 

The controversy between the “hereditarians” on the one hand and the 
“environmentalists” on the other as to whether the IQ is a measure mainly of 
native endowment or of environmental influences working upon native 
endowment does not render the IQ a valueless concept to those in the teaching 
profession. It matters little to teachers whether the IQ is an indicator of what 
one can do or of what one does do. As opposed to the hereditarians who would 
say regarding the education of a child with an IQ rating of, say, 85, “You can’t 
make a silk purse out of a sow’s ear,” and also, perhaps, as opposed to Stoddard 
(3), representing the environmentalists, who would say, “We can make a silk 
purse out of what we thought was a sow’s ear,” the teacher’s aim, judged by 
her past attitude and performance, will likely be—as the writer believes it 
should be: “I’ll go as far as I can toward making a silk purse out of what 
seems to be a sow’s ear.” 

MANY people have the ambition to succeed; they may even have special aptitude 
for their job. And yet they do not move ahead. Why? Perhaps they think that 
since they can master the job, there is no need to master themselves.—John Stevenson 

Choosing a Suitable Teacher-Made Test 


W. A. Saucier 


Department of Education, Baker University, Baldwin, Kansas 


It is a truism that no instrument, tool or machine can be evaluated without 
reference to its function. For instance, the choice between a truck and an 
automobile may depend on whether the individual intends to use the vehicle 
to transport a large quantity of rock or a few people. This basic principle has 
not always been recognized in theory and practice in measurement. Specifically, 
several writers on tests compare the essay, or discussion, examination with 
the new-type examination, or objective test, with little or no consideration of 
the aims, or goals, of education. Following these theorists, many elementary 
school teachers disregard the ends of education as they choose a particular test 
merely because it can be scored easily, the pupils like it, or nearly all other 
teachers use it freely. Hence an intelligent approach to our problem is through 
a reconsideration of educational objectives. 

The objectives of education that are based on modern psychology and the 
democratic philosophy emphasize development in insight, comprehension, 
thinking, doing, performing, constructing, composing, wholesome individual 
interests, intelligent cooperation, and democratic, or scientific, attitudes. Such 
points of emphasis require a teaching procedure that makes extensive use of 
problems and purposeful activities, or projects. In this procedure, the pupil 
learns facts and skills in relation to one another and the several goals toward 
which he moves as he engages in the various real and meaningful activities. 

To discover advancement in the type of learning just described, the teacher 
may well rely chiefly on incidental, or informal, means. As the pupil shares 
mentally and emotionally in planning the activity, engages in it, and finally 
participates in evaluating it, the teacher has several opportunities to observe 
the pupil’s reactions. Thus the teacher can discover the pupil’s actual use of 
information and skills and his real attitudes in lifelike situations. For example, 
the best check on a pupil’s progress in English composition is not his isolated 
knowledge of the elements of the subject, but how he writes in genuine 
self-expression outside of class as well as within it. Likewise, the real interest of a 
pupil in literature can be discovered more reliably by his habits of reading than 
by any specific questions on reading given to him in a pencil-and-paper test. 


However, there is at least a supplementary use for some kind of test, or 
examination. As already mentioned, the kind of test that is chosen should be 
determined by its function. If the primary purpose of the test is to encourage 
and reveal atomistic learning, consisting principally of the memorization of 
specific facts, an examination containing many samplings of these small bits of 
learning ought to be chosen. Such an examination is the objective test, or 
so-called new-type examination.¹ On the other hand, if the chief reason for giving 

¹ This examination is new only in form. Our forefathers were asked: Who was the first president of 
the United States? We call this old-type question a new-type one if it is changed to read: The first 
president of the United States was ______. 








38 THE NATIONAL ELEMENTARY 








the examination is to discover and promote ability in extensive discussion, 
logical self-expression, problem solving, and thinking, an examination that 
requires an exhibition of these abilities should be selected. Obviously, this re- 
quirement is met by the discussion, or essay, examination. 

Among the criteria for the evaluation of these two kinds of examinations, 
the chief one is validity. On this theorists seem to agree. Validity, as applied 
to examinations, means that an examination measures what it is supposed to 
measure. Unfortunately, many writers seem to believe that a test is supposed 
to measure only the acquisition of many bits of information in a particular 
area of subject matter. Holding this restricted view of education, they 
logically conclude that the objective test, or new-type examination, is more 
valid than the discussion, or essay, examination. On the contrary, some 
educators believe that education consists of the development of scientific, or 
democratic, attitudes, integration in learning, broad comprehension, creative 
expression, and scientific habits of thinking, all of which would include the 
acquisition of facts. These educators are consistent as they insist that, for 
measuring these ends in education, the discussion examination is more valid 
than the objective test. Thus the comparative validity of the two kinds of 
examinations can be determined only in relation to the purpose, or function, of 
the examination. 

Some educators, even the Commission on Evaluation of the Progressive 
Education Association in the famous Eight-Year Study, have claimed that 
objective tests for measuring aspects of thinking have been constructed. A careful 
and unbiased study of such tests, however, reveals that they have the inherent 
weakness of all objective tests that require merely checking the ideas, or 
thoughts, of the constructor of the test. They do not require the pupil to express 
his own thoughts.² 

Another important factor in an examination is its reliability. A test is 
reliable if it is accurate or dependable. There is consensus among writers on 
the examination that the objective test is more reliable than the discussion 
examination. This is to be expected. If an examination is really objective the 
examination itself determines the grade, not the teacher’s judgment. As a 
result, a grader of a paper will discover the same score for a paper as will any 
other grader. Likewise a grader will give the same grade to a paper that he 
gave to it a month before. Nevertheless, there is no scientific evidence that the 
discussion examination is very unreliable. 


In an effort to prove that the discussion examination is so unreliable that 
it is practically useless, many writers have monotonously referred to famous 
studies of the range of grades given by several graders to one paper. 
Unfortunately these writers do not also point out that in these studies the grades 
tend to cluster rather closely at some point.³ As all statisticians know, the 
central tendency, as well as the range, is significant in the interpretation of the 
results of the study. The failure of these writers to refer at all to the clustering 
of these grades indicates that they are unscientific in their attack on the 
reliability of the discussion examination. 

² A complete set of these tests can be purchased from the School of Education of the University of 
Chicago. 
³ See recent books on educational measurement. 
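
The difference between reporting only the range and reporting the clustering as well can be shown with a short calculation. The grades below are invented purely for illustration (thirteen hypothetical graders marking one paper) and are not drawn from any of the studies mentioned; the five-point window is likewise an arbitrary assumption.

```python
from statistics import median

# Hypothetical grades given to ONE paper by thirteen graders (illustrative only).
grades = [62, 70, 72, 73, 74, 74, 75, 75, 76, 77, 78, 80, 92]


def summarize(marks, window=5):
    """Return (range, median, number of marks within `window` points of the median)."""
    mid = median(marks)
    spread = max(marks) - min(marks)
    near = sum(1 for m in marks if abs(m - mid) <= window)
    return spread, mid, near


spread, mid, near = summarize(grades)
print(spread, mid, near)  # 30 75 11
```

Quoted alone, the 30-point range suggests wide disagreement; yet in this made-up set eleven of the thirteen grades fall within five points of the median, and two outlying marks account for most of the range.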

These same theorists claim that the discussion examination is unreliable 
because it samples learning inadequately. In making this charge they have in 
mind a sampling of merely a field of knowledge, or a body of subject matter. 
They do not include adequate sampling of desirable attitudes and habits of 
study and thinking, ability to relate ideas, and logical self-expression. 
Obviously, such important outcomes of a democratic program of education can be 
sampled much more effectively by the discussion examination than by the 
objective test. Thus, if the objectives of education are not restricted to learning 
bits of subject matter, it is the objective test that must be charged with inade- 
quate sampling. 

Our conclusion is that the theorists have not proved that the discussion, or 
essay, examination is decidedly unreliable. Moreover, since the teacher should 
give several discussion examinations as well as papers during the semester, an 
average of all the grades should yield a rather reliable final one. This is not to 
mention the additional use of oral class discussion and various activities. 
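
Why an average of several grades should be steadier than any single grade can be sketched with the usual formula for the standard error of a mean. The sketch assumes that the errors in the separate grades are independent and of about equal size, which the article does not claim, and the figures used are hypothetical.

```python
import math


def error_of_average(single_grade_error: float, n_grades: int) -> float:
    """Standard error of the mean of n grades, assuming independent, equal-sized errors."""
    if n_grades < 1:
        raise ValueError("need at least one grade")
    return single_grade_error / math.sqrt(n_grades)


# Hypothetical: each grade carries an error of about 8 points.
for n in (1, 4, 16):
    print(n, error_of_average(8.0, n))
```

Under these assumptions the error of the final average falls with the square root of the number of grades, so four examinations halve it and sixteen quarter it.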

The third criterion of a satisfactory test, as given by practically all writers, 
is usability or practicality. This includes such factors as cost and ease of ad- 
ministration. Neither the discussion examination nor the ordinary objective 
test involves any appreciable cost. However, certain kinds of objective tests— 
for example, some of those constructed by the Commission on Evaluation of 
the Progressive Education Association—are clearly impractical. They require 
an unusually large amount of mimeographed material to obtain only a small 
sampling of the pupil’s learning. With reference to ease of administration, more 
time is required to administer the discussion examination than the objective 
test. Yet, since the discussion examination is superior to the objective test in 
promoting and discovering the most important outcomes of education, the 
pupil spends his time well in taking this examination. As to the saving of any 
time by the teacher, especially outside of class, this is of little consequence in 
comparison with the welfare of the pupils. 

So far we have implied that one unavoidable result from the use of either 
examination is the development of some kind of study habits. This important 
point requires special consideration. Carefully made investigations of the 
study habits of students preparing for the objective test show that they look 
primarily for minute details and memorize numerous bits of information; 
whereas, when they prepare for the discussion examination they tend to search 
for broad concepts, to relate ideas, and to draw conclusions. This, of course, 
suggests that in choosing a test, the teacher should consider the kind of study 
habits that he desires to develop in his pupils. 

The preceding discussion implies the necessity of preparing pupils for the 
kind of examination they will take. Unfortunately, teachers customarily teach 
only for the objective test. In theory and practice it has commonly been 
assumed that it is impossible for elementary school pupils to organize their 
thoughts logically and to engage in extensive and creative self-expression, as 
is required in an effective discussion examination. Such abilities can be 
developed in elementary school pupils provided the teachers in all the grades 
spend much time in directing pupils in writing, as they have occasion to report 
on various activities and to present significant facts, generalizations, or con- 
clusions in the study of a unit. 

In the construction of the essay examination the teacher should endeavor to 
make each question require broad comprehension and extensive discussion. 
This means that ten or more questions are too many for an hour. The pupil 
must have time to think and to express himself logically. Sometimes only one 
question, stated as a topic of an essay, would be suitable. Furthermore, the 
teacher should avoid using the type of question introduced by “discuss.” In- 
stead, he ought to give a big problem for the pupils to solve and insist that they 
attack and stick to the problem. Such questions would demand making reports, 
evaluating activities, making comparisons, expressing relationships, drawing 
and supporting conclusions, and the like. It should go without saying that the 
questions would require the pupil to reorganize and express his own thoughts, 
not just to write ready-made discussions from the teacher or the text. 

The papers should be graded not only for information but also for the use 
of it. The teacher needs to realize that a fact is useless in itself. Accordingly 
the pupils should be expected to show both breadth and application of knowl- 
edge, or thinking. The pupil may have acquired merely an aggregation of 
numerous facts but not be able to use them in thinking. If he really thinks, 
however, he must have a mass of facts and know the ones that are pertinent to 
the solution of the problem. Therefore, if the teacher concentrates on the goal 
of thinking in grading as well as in constructing the examination, he will not 
neglect to determine the acquisition of facts. This means that in grading papers 
he will consider composition as being an essential factor. 

As has already been indicated, the results from the essay examination 
should be used principally to discover and to advance effective learning for 
democratic living. To this end it seems wise to write comments on papers but 
no grades. Grades tend to direct the attention of the pupils away from criticisms, 
explanations, and directions. Of course, if the school still gives grades, the 
teacher should record them in his classbook. In the individual conferences that 
may well follow the examination, the teacher can tell each pupil what his grade 
is. The chief purpose of this conference, however, should be to give further 
assistance to the pupil toward seeing his present progress and to contribute 
inspiration toward his future progress. 





Meeting of Editorial Committee 


Members of the Editorial Committee of the Department of Elementary 
School Principals will meet in Harrisburg, Pennsylvania, February 18, 19, and 
20 to complete plans for the 1946 Yearbook on “World Good Will,” and to 
further arrangements for future Yearbooks. 


PRINCIPAL, FEBRUARY, 1946 


























Suggestions Regarding a Testing Program 


Ernest E. Bayles 
School of Education, University of Kansas, Lawrence, Kansas 


There seems to be a rather widespread feeling today among writers and 
speakers on educational matters that classroom examinations, particularly of 
the pencil-and-paper variety, should be eliminated. Although we recognize that 
past and even present classroom testing programs have fallen far short of what 
they should be, we are not convinced that we can get along satisfactorily with- 
out formal pencil-and-paper tests even in elementary schools. 

Without knowledge of progress toward goal there will be little, if any, 
progress. This principle enjoys practically universal acceptance by psychologists 
regardless of the “school” to which they may belong. Specific information as 
to what a particular effort accomplishes is necessary in order to know how to 
modify it and thereby improve. Mere desire to improve is not enough. Informa- 
tion regarding both successes and failure is a sine qua non of progress. 

We recognize the weight of the contention that if classes are taught reflec- 
tively, as we would want them to be, testing is an integral part of the process. 
Reflective thought has been defined as “a process of finding and testing mean- 
ings,” and therefore genuinely reflective teaching would seem to make it quite 
unnecessary to administer periodic check-ups of pupil progress. The participa- 
tors in a genuine research project seldom need to be quizzed over the project 
in order to insure retention of what has been discovered. 

But even those who are moving back the frontiers of knowledge make 
careful notes as they proceed and use them frequently under circumstances 
which are very much like classroom testing as we would desire it. Moreover, 
even the most ably conducted classes cannot hope to duplicate the conditions 
of a genuine research project, though they may strive toward that form as an 
ideal. 

It is the writer’s experience that periodic written “quizzes” of a proper 
nature markedly improve the quality of classroom instruction even on advanced 
doctoral levels, wherein it would seem that such testing would be least neces- 
sary. Even with students as mature and advanced as these, a formal test in 
which opportunity is given for expression of ideas and for critical reaction by 
an able examiner makes for organization and refinement of concepts that seem 
difficult to achieve as effectively in any other practical way. 

What is the nature of a testing program which we would consider genuinely 
educational? In order to answer this question, we should first state our educa- 
tional purpose, namely, to develop a pupil's independent learning ability while 
enhancing and harmonizing his learnings. This represents a two-fold purpose: 
first, teaching a youngster how to think, and second, increasing his fund of 
insights while bringing them (the new as well as the old) into greater agree- 
ment one with another. 

We are not greatly concerned about how many or how few particularized 
items of information a youngster can pile up in memory. If some circumstance 
should arise to make a memory-level test desirable, such as to satisfy the 
curiosity of a principal or possibly to cooperate with the members of a survey 
committee, we would employ the best objective, informational test over the 
field that we could find, explain to the pupils why we were giving it, and 
administer the test. 

But for our own purposes such a test would be quite unnecessary because 
we would wish to know how youngsters can use information in solving 
challenging problems, rather than how much they carry around in their heads. 
On ordinary tests we would make available to pupils the basic informational 
material which they may need. For advanced classes we would permit access 
to notebooks or textbooks as desired, and for elementary classes the necessary 
factual information would be made available by means of blackboard, charts, 
maps, or whatever is suitable. 

Ability to use information means understanding. Items which make up a 
memory-level test have to be replicas of previous experience. But items which 
represent a test for understanding must be novel to the person being tested, so 
that such person will have opportunity to apply what he knows rather than 
merely to restate it. 

For example, to ask for the names of all forty-eight states in the Union, 
together with the capital of each, is essentially a memory-level test item be- 
cause, unless a map or other source of information is available, the question 
cannot be answered except from memory. On the other hand, with a map 
available for each youngster, one might ask whether state capitals tend to be 
located at or near the geographical centers of their respective states. Assuming 
that this question has never occurred to the examinees before, a correct answer 
can very well be taken to indicate understanding of certain features of maps, 
such as how the boundaries of states are represented and how state capitals 
are designated. 

Understanding-level tests, like memory-level, can be either essay or objec- 
tive. Contrary to somewhat common belief, objective test items are not always 
memory-level and essay test items are a long way from always being under- 
standing-level. Novelty of an item for the one being tested is the real crux of 
the matter. Any experience which for a given individual represents a new use 
of an old insight is a test of understanding, regardless of its form. 

We have said that we would aim our testing at understanding level rather 
than memory. And if achievement of certain specific and specifiable under- 
standings were our sole and only aim in education, then our testing program 
could well be composed solely of pencil-and-paper tests, objective in nature, 
which give repeated opportunities for applying the basic understandings that 
the instruction is designed to achieve. But we have said that in addition we wish 
instruction to promote independent learning ability on the part of pupils, and 
that their understandings shall not only be increased both in number and in 
scope, but that they shall also become more harmonic. Thus, we would need 
a testing program which measures not only understanding but also harmony 
of understandings and independent learning ability (or the ability to think 
reflectively). 
The problem of testing a pupil’s ability to think reflectively has been tackled 
by several investigators and test makers within the past ten or fifteen years. 
So far efforts at making tests for the reflective ability have followed the pattern 
of dividing the process of reflective thought into its assumed constitutive 
elements, designing an objective test for each element, and then arriving at a 
final score which is the sum of the separate scores over the various elements. 

For instance, it is assumed that the “ability to draw inferences from facts” 
is one element of the ability to think reflectively. Therefore a test item such 
as the following is devised: 


In an experiment some white starch was treated with brown iodine solution. 
This was done ten times and each time a blue color was formed. 

Later some white starch was mixed with saliva. The mixture was left for a 
time and then it was treated with brown iodine solution. This was done ten 
times and each time no blue color was formed. 


a. The starch was changed to sugar by the action of saliva ......... ( ) a 
b. [statement illegible in source] .................................. ( ) b 
c. [statement illegible in source] .................................. ( ) c 
d. Saliva produced a change in the starch ........................... ( ) d 
e. Starch mixed with iodine solution did not turn blue .............. ( ) e 


For each of the statements lettered a, b, c, d, and e, the pupil is to mark 
whether the statement is a reasonable interpretation of the result obtained, 
whether the facts are insufficient to justify the interpretation, or whether the 
statement cannot be true because the results contradict it. 

Other elements tested for include: ability to discover and define problems, 
ability to observe phenomena accurately, and ability to select facts relevant to 
a problem. 

Tests of this kind, we are satisfied, do indeed have value in indicating with 
considerable accuracy the degree of reflective ability a person possesses. The 
results seem to correlate rather highly with the results of intelligence examina- 
tions, and it seems reasonable to assume that intelligence and reflective ability 
should correlate highly. But, as generally usable classroom tests, we feel that 
something is still to be desired. 

First, they are rather unwieldy, although not seriously so, since they 
generally require two 50-minute class periods for giving. Second, they are not 
available in published form except over generalized subject matter content, 
whereas a teacher would usually want the test to cover the particular material 
of his own instruction, and such tests are too complicated for busy teachers to 
dash off between classes or at the close of a hard day’s work. 

Third, and perhaps most important, is the fact that such tests are much 
more complicated than they need to be because of the atomistic theory upon 
which they are based. Configurational psychology asserts that a whole is more 
than the sum of its parts, and that a part when separated from a whole is no
longer a part of that whole but a whole in itself. Actually, an item like the one 
reproduced above is, to a degree at least, a problem in itself, and its proper 
handling represents essentially the total process of reflection rather than a
specifiable portion or element of it. Therefore, a test composed of dozens of
such opportunities, requiring two hours of time, should give a pupil ample
opportunity to demonstrate his reflective ability. But it does so, not because 
each part tests a different “element” of thinking, but because each and every 
item actually calls for the total process. 

Assuming this to be true, we should be able to save considerable time by 
using only a few well-chosen samples, those which experience shows to correlate
most highly with reflective capacity. A few well-chosen items often turn
out a better test than many items not so well chosen.

Now, what specifically do we propose? We want a test to measure two 
outcomes: (1) reflective ability; (2) number or scope, and degree of harmonization,
of insights or understandings. Hence, test items should represent
problems—situations which a pupil cannot quite see through until he has given 
the matter a little thought. 

In all such test items, the maturation level of the pupils must be watched 
closely so that each item, though not immediately answerable, can be answered 
by a pupil who has the requisite understanding as soon as he has given it a 
bit of study. In other words, each test item should be novel but not too much 
so. Something within the realm of previous study should be put to the pupil 
in a form slightly different from anything previously experienced, but not so 
different as to make him unable to think it through in the time available. 

Since reflection-level items require more time for answering than do either 
memory-level or understanding-level, only a small number of reflection-level 
items may be used—perhaps three or four in a 40-minute period. And the items 
must be such as can be handled, each in fifteen or twenty minutes, by the pupils 
on the maturation level which they represent. 

In order that test items, besides being reflective, may measure scope and 
harmony of understandings, pupils must be given opportunity to indicate not 
only the answers or conclusions which they reach, but also the bases for arriving
at them. Whether capitals tend to be at the geographical centers of their
respective states or nations will probably depend upon the states or nations 
which are chosen. Therefore, a given pupil’s answer to such a question can 
hardly be judged fairly until the examiner knows what states or nations the 
pupil has in mind. 

Moreover, the point of view (or logic) employed by the pupil may make 
considerable difference as to what answer can be considered correct. For ex- 
ample, because of differences in points of view, certain acts of a governmental 
agency may be viewed as highly desirable by a Democrat and highly undesirable 
by a Republican. If a test is to indicate the quality of a pupil’s thinking, together 
with the scope and harmony of his insights, it must give the pupil opportunity 
to indicate not only the answers or conclusions which he considers proper, but 
also his reasons, both logical and factual, for arriving at his answers or con- 
clusions. In this respect objective tests seem to be practically unusable because 
they give pupils opportunity to indicate answers or conclusions only. 

It is true that certain objective tests have been devised, multiple choice in 
form, which first offer several answers from which one, the best, is to be chosen. 
Below these answers are presented several reasons, one of which will explain
why the indicated answer was chosen. However, such tests are not wholly
satisfactory because, first, it is difficult to present an adequate array of reasons
for the various possible answers and, second, it is possible to get a better line
on a pupil’s thinking if he expresses it in his own words rather than if he merely 
reacts to the words of someone else. 

In summary, therefore, we would say that as far as pencil-and-paper tests 
are used, they should ideally be thought-provoking, essay type, with few ques- 
tions and ample opportunity for a pupil to indicate not only his own answer 
but also the way he reached the answer. Evaluation of a pupil’s effort should 
be on the basis of adequacy and harmony: adequacy (and accuracy) of the
data employed and harmony between data and conclusions.


But we would add that formal pencil-and-paper tests should not be relied 
upon wholly. They should be supplemental to day-to-day observations of each 
pupil’s work in the light of questions such as: How reflective is the pupil 
under ordinary circumstances? How good are his day-to-day conclusions in 
the light of adequacy and harmony? How adequate and harmonious, at his 
maturation level, is his total outlook on life? Do his conclusions in one field of 
thought or endeavor tend to harmonize with those in other fields? Is the pupil 
open-minded, yet does he require convincing evidence before he changes his 
mind? This list of questions is merely indicative of what might be looked for; 
it is not exhaustive. 

Our view on testing, therefore, sums up to saying that we would have a
teacher develop a testing program which fosters a genuinely American pro- 
gram for teaching. Memory-level tests are seldom if ever to be used, and 
objective tests seem not entirely suitable. 

But teachers are ordinarily very busy persons, and find that the advantage 
of quick scoring makes objective tests highly desirable if they can be used at all. 
Consequently, we would say that if objective tests must be employed, the items 
should be brought to the understanding level. Though the measurement is 
indirect, scores on understanding-level tests tend to correlate rather highly 
with reflective ability, and they can be made to indicate adequacy of outlook 
fairly well. Harmony of outlook will then have to be judged by the day-to-day 
observations which we have just described. Perhaps a well-designed essay test
can be given occasionally. 


We must redouble our efforts to live in harmony with our world neighbors, 
and, of more intimate concern to us, with our fellow Americans of differing 
cultural backgrounds, an achievement which will be an integral part of winning
the peace. The writers of the 17th Yearbook of the California Elementary
School Principals’ Association, EDUCATION FOR CULTURAL UNITY, 
have looked at this problem from many angles. Some are specialists in their
fields. Others have written of their personal experiences. 

Copies of this new Yearbook can be obtained from Sarah L. Young, Parker 
School, Oakland 3, California. 





State Representatives, 1945-46


ALABAMA 
Robert C. Johnston 
Birmingham, Ala. 


ARIZONA 
Edwon L. Riggs 
Phoenix, Ariz. 


ARKANSAS 
Mrs. Hazel H. Isgrig 
Little Rock, Ark.


CALIFORNIA 
Daniel Gilson 
Oakland, Calif. 


COLORADO
Nellie V. Lind 
Denver, Colo. 


CONNECTICUT
Caroline C. Jourdan 
New Haven, Conn. 


DELAWARE 
Mrs. Elva Dugan
Wilmington, Del. 


DISTRICT OF COLUMBIA
Mrs. Maud Roby 
Washington, D. C. 


FLORIDA
Frances Belcher 
Clearwater, Fla. 


GEORGIA
Pauline Martin
Decatur, Ga. 


IDAHO
M. Lillian McSorley 
Lewiston, Idaho 


ILLINOIS 
Joseph Murphy 
Peoria, Ill. 


INDIANA 
Charlotte Carter 
Indianapolis, Ind. 


IOWA
Esther Helbig 
Dubuque, Iowa 


KANSAS 
Myrtle M. Evans 
Kansas City, Kans. 





LOUISIANA
Loretta R. Doerr 
New Orleans, La. 


MAINE 
William M. Cullen 
Lewiston, Maine 


MARYLAND 
Mrs. Anna P. Rose 
Chevy Chase, Md. 


MASSACHUSETTS 
Alice L. Goodspeed
Dedham, Mass. 


MICHIGAN 
Mrs. Verna Donlin 
Detroit, Mich. 


MISSISSIPPI 
Mrs. Betty Cantwell 
Clarksdale, Miss. 


MISSOURI
Anna F. Edwards
Kansas City, Mo. 


MONTANA 
Alice Lausted 
Billings, Mont. 


NEBRASKA 
Florence B. Reynolds 
Omaha, Nebr. 


NEW HAMPSHIRE
Alice L. Jeffords 
Portsmouth, N. H. 


NEW JERSEY
Ralph C. McConnell
Atlantic City, N. J. 


NEW MEXICO
Charles L. Mills
Hobbs, N. Mex. 


NEW YORK
Mrs. Florine H. Elrey 
Mamaroneck, N. Y. 


NORTH CAROLINA
Florence M. Reid 
Greensboro, N. C. 


NORTH DAKOTA
R. D. Brown
Fargo, N. Dak.


OHIO
Charles A. Thornton
Shaker Heights, Ohio 


OKLAHOMA 
Ralph H. Kennedy 
Tulsa, Okla. 


OREGON 
W. C. Painter 
Portland, Ore. 


PENNSYLVANIA 
William J. Laramy 
Haverford Township, Pa. 


RHODE ISLAND
Marion B. Bray
Providence, R. I. 


SouTH CAROLINA 
W. J. Castine 
Greenville, S. C. 


TENNESSEE
Gerald L. Bell 
Knoxville, Tenn. 


TEXAS 
Thomas E. Pierce 
Denton, Texas 


UTAH 
Mrs. Lois Hinckley 
Salt Lake City, Utah 


VERMONT 
A. Viola Burns 
Rutland, Vt. 


VIRGINIA 
Lillian M. Johnson
Norfolk, Va. 


WASHINGTON 
Arthur C. Gravrock
Seattle, Wash. 


WEST VIRGINIA 
Rex Smith 
Morgantown, W. Va. 


WISCONSIN 
Phillip H. Geil 
Milwaukee, Wisc. 


WYOMING 
Margaret Chambers 
Casper, Wyo. 


ALASKA 
Harry L. Holt 
Kodiak, Alaska 


HAWAII
Mrs. Bessie Scobie
Honolulu, Hawaii


Program of NEA Travel Service


Mexican Tours—Two tours to Mexico of about 30 days each and probably a 
third of about 16 days will be offered next summer. The 30-day tours are scheduled
to begin June 15 and July 25, and the third late in August. Cities of
departure for all Mexican tours will be St. Louis, Missouri, and Austin, Texas.
Materials will be sent to travelers as soon as they complete registration. The 
price of the 30-day tour, on an all-expense basis from St. Louis, is expected to
be about $275. The shorter tour will be about $175. Tours originating in Austin 
will be about $50 less. 



Regional Vacation Centers—Santa Fe, New Mexico, and either New England 
or North Carolina will be the sites of two Regional Vacation Centers to be estab- 
lished next summer. Recreational, social, and cultural programs for each center 
are being planned by committees of local teachers, administrators, and representatives
of PTA and other interested organizations. Centers will be in
operation for three periods of two or three weeks' duration each. The estimated
cost is $25 and $35 per week. Responsibility and expense of travel to and from 
centers must be assumed by the visitors. 


Program Expansion—As the transportation and hotel situations improve,
both foreign and domestic tour programs will be greatly expanded. Travel 
programs to both the Pacific and Atlantic coast regions of the United States and 
to several foreign countries will be developed for 1947. Eventually foreign tours 
should take our educators to most of the countries of the world. If the Regional 
Vacation Centers attract a sufficiently large number of teachers, other centers 
will be established in other interesting regions of healthful climate. 


Program Objectives—The association of our educators with men and women 
of various professions and occupations in countries and regions visited is a
primary objective in planning programs for all tours and centers. Historical 
background, natural environment, cultural landscape, and social and economic 
problems of the country or region are to be emphasized by specialists accompanying
the groups. Cost to participants is to be kept as low as possible by
operating all programs on a non-profit basis. 


For further details address Paul H. Kinsel, Director, NEA Division of 
Travel Service, 1201 16th Street, Washington 6, D. C.